NOVEL MOLECULES OF THE TANGO-77 RELATED PROTEIN
FAMILY AND USES THEREOF

Background of the Invention
The polypeptide cytokine interleukin-1 (IL-1)
is a critical mediator of inflammatory and overall immune
response. To date, three members of the IL-1 family,
IL-1α, IL-1β and IL-1ra (interleukin-1 receptor
antagonist), have been isolated and cloned. IL-1α and
IL-1β are proinflammatory cytokines which elicit
biological responses, whereas IL-1ra is an antagonist of
IL-1α and IL-1β activity. Two distinct cell-surface
receptors have been identified for these ligands, the
type I IL-1 receptor (IL-1RtI) and type II IL-1 receptor
(IL-1RtII). Recent results suggest that the IL-1RtI is
the receptor responsible for transducing a signal and
producing biological effects.

As mentioned above, IL-1 is a key mediator of the
host inflammatory response. While inflammation is an
important homeostatic mechanism, aberrant inflammation
has the potential for inducing damage to the host.
Elevated IL-1 levels are known to be associated with a
number of diseases, particularly autoimmune diseases and
inflammatory disorders.

Since IL-1ra is a naturally occurring inhibitor of
IL-1, IL-1ra can be used to limit the aberrant and
potentially deleterious effects of IL-1. In experimental
animals, pretreatment with IL-1ra has been shown to
prevent death resulting from lipopolysaccharide-induced
sepsis. The relative absence of IL-1ra has also been
suggested to play a role in human inflammatory bowel
disease.

Summary of the Invention
The present invention is based, at least in part,
on the discovery of a gene encoding Tango-77, a secreted
protein that is predicted to be a member of the cytokine
superfamily. The Tango-77 cDNA described below (SEQ ID
NO:1) has three possible open reading frames. The first
potential open reading frame encompasses 534 nucleotides
extending from nucleotide 356 to nucleotide 889 of SEQ ID
NO:1 (SEQ ID NO:3) and encodes a 178 amino acid protein
(SEQ ID NO:2). This protein may include a predicted
signal sequence of about 63 amino acids (from about amino
acid 1 to about amino acid 63 of SEQ ID NO:2 (SEQ ID
NO:4)) and a predicted mature protein of about 115 amino
acids (from about amino acid 64 to amino acid 178 of SEQ
ID NO:2 (SEQ ID NO:5)).

The second potential open reading frame
encompasses 498 nucleotides extending from nucleotide 389
to nucleotide 889 of SEQ ID NO:1 (SEQ ID NO:6) and
encodes a 167 amino acid protein (SEQ ID NO:7). This
protein may include a predicted signal sequence of about
52 amino acids (from about amino acid 1 to about amino
acid 52 of SEQ ID NO:7 (SEQ ID NO:8)) and a predicted
mature protein of about 115 amino acids (from about amino
acid 53 to amino acid 167 of SEQ ID NO:7 (SEQ ID NO:9)).

The third potential open reading frame encompasses
408 nucleotides extending from nucleotide 481 to
nucleotide 889 of SEQ ID NO:1 (SEQ ID NO:10) and encodes
a 136 amino acid protein (SEQ ID NO:11). This protein
includes a predicted signal sequence of about 21 amino
acids (from about amino acid 1 to about amino acid 21 of
SEQ ID NO:11 (SEQ ID NO:12)) and a predicted mature
protein of about 115 amino acids (from about amino acid
22 to amino acid 136 of SEQ ID NO:11 (SEQ ID NO:13)).
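The relationship between the three predicted products can be checked with a short sketch. The Python below is illustrative only: the dictionary and the function name `mature_span` are hypothetical, the lengths are the approximate values stated above, and cleavage is assumed to occur immediately after the last residue of the signal sequence.

```python
# Approximate lengths stated in the text, keyed by protein SEQ ID NO:
# (total residues, predicted signal-sequence residues).
PREDICTED = {
    "SEQ ID NO:2": (178, 63),
    "SEQ ID NO:7": (167, 52),
    "SEQ ID NO:11": (136, 21),
}

def mature_span(total_len, signal_len):
    """Return (first, last, length) of the mature protein, 1-based inclusive.

    Assumes cleavage immediately after the last signal-sequence residue.
    """
    first = signal_len + 1
    last = total_len
    return first, last, last - first + 1

for name, (total, signal) in PREDICTED.items():
    first, last, length = mature_span(total, signal)
    print(f"{name}: mature protein spans residues {first}-{last} ({length} aa)")
```

Under these assumptions each open reading frame yields a mature protein of 115 residues, which is the length stated for all three frames.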

As used herein, the terms "Tango-77", "Tango-77
protein", "Tango-77 polypeptide" and the like can refer
to any polypeptide produced by the cDNA of SEQ ID NO:1,
including any and all of the Tango-77 gene products
described above.

Tango-77 is expected to inhibit inflammation and
play a functional role similar to that of secreted
IL-1ra. For example, it is expected that Tango-77 may
bind to the IL-1 receptor, thus blocking receptor
activation by inhibiting the binding of IL-1α and IL-1β
to the receptor. Alternatively, Tango-77 may inhibit
inflammation through another pathway, for example, by
binding to a novel receptor. Accordingly, Tango-77 may
be useful as a modulating agent in regulating a variety
of cellular processes including acute and chronic
inflammation, e.g., asthma, chronic myelogenous leukemia,
rheumatoid arthritis, psoriasis and inflammatory bowel
disease.

In one aspect, the invention provides isolated
nucleic acid molecules encoding Tango-77 or biologically
active portions thereof, as well as nucleic acid
fragments suitable as primers or hybridization probes for
the detection of Tango-77.

The invention encompasses methods of diagnosing
and treating patients who are suffering from a disorder
associated with an abnormal level (undesirably high or
undesirably low) of inflammation, abnormal activity of
the IL-1 receptor complex, or abnormal activity of IL-1,
by administering a compound that modulates the expression
of Tango-77 (at the DNA, mRNA or protein level, e.g., by
altering mRNA splicing) or by altering the activity of
Tango-77. Examples of such compounds include small
molecules, antisense nucleic acid molecules, ribozymes.

The invention features a nucleic acid molecule
which is at least [illegible]% (e.g., [illegible] or
98%) identical to the nucleotide sequence shown in SEQ
ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the
nucleotide sequence of the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807 (the
"cDNA of ATCC 98807"), or a complement thereof.

The invention features a nucleic acid molecule
which includes a fragment of at least [illegible] (e.g., 250,
[illegible], 325, 350, 375, 400, 425, 450, 500, 550,
[illegible]) nucleotides of the nucleotide sequence
shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, the nucleotide sequence of the cDNA of ATCC 98807, or
a complement thereof.

The invention also features a nucleic acid
molecule which includes a nucleotide sequence encoding a
protein having an amino acid sequence that is at least
45% (55%, 65%, 75%, 85%, 95%, or 98%) identical to the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, or the
amino acid sequence encoded by the cDNA of ATCC 98807.

In a preferred embodiment, a Tango-77 nucleic acid
molecule has the nucleotide sequence shown in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10 or the
nucleotide sequence of the cDNA of ATCC 98807.

Also within the invention is a nucleic acid
molecule which encodes a fragment of a polypeptide having
the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, wherein the fragment
includes at least 15 (e.g., 25, 30, 50, [illegible])
contiguous amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, or the polypeptide
encoded by the cDNA of ATCC Accession Number 98807.

The invention includes a nucleic acid molecule
which encodes a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA of ATCC
Accession Number 98807, wherein the nucleic acid molecule
hybridizes to a nucleic acid molecule comprising SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or a
complement thereof under stringent conditions.

Also within the invention are: an isolated
Tango-77 protein having an amino acid sequence that is at
least about 45%, preferably 65%, 75%, 85%, 95%, or 98%
identical to the amino acid sequence of SEQ ID NO:5, SEQ
ID NO:9 or SEQ ID NO:13 (mature human Tango-77) or the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:7 or SEQ ID
NO:11 (immature human Tango-77).

Also within the invention are: an isolated
Tango-77 protein which is encoded by a nucleic acid
molecule having a nucleotide sequence that is at least
about [illegible]%, preferably 75%, 85%, or 95% identical to SEQ
ID NO:3, SEQ ID NO:6, SEQ ID NO:10 or the cDNA of ATCC
98807; and an isolated Tango-77 protein which is encoded
by a nucleic acid molecule having a nucleotide sequence
which hybridizes under stringent hybridization conditions
to a nucleic acid molecule having the nucleotide sequence
of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the non-coding
strand of the cDNA of ATCC 98807, or the complement
thereof.

Also within the invention is a polypeptide which
is a naturally occurring allelic variant of a polypeptide
that includes the amino acid sequence of SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13 or an
amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10 or the complement thereof under stringent
conditions.

Another aspect of the invention features
nucleic acid molecules which specifically detect
Tango-77 nucleic acid molecules, relative to nucleic acid
molecules encoding other members of the cytokine
superfamily. For example, in one embodiment, a Tango-77
nucleic acid molecule hybridizes under stringent
conditions to a nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
complement thereof. In another embodiment, the Tango-77
nucleic acid molecule is at least 300 (325, 350, 375,
400, 425, 450, 500, 550, 600, 650, 700, 800, 900, or
[illegible]) nucleotides in length and hybridizes under stringent
conditions to a nucleic acid molecule comprising the
nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
complement thereof. In yet another embodiment, the
invention provides an isolated nucleic acid molecule
which is antisense to the coding strand of a Tango-77
nucleic acid.

Another aspect of the invention provides a vector,
e.g., a recombinant expression vector, comprising a
Tango-77 nucleic acid molecule of the invention. In
another embodiment, the invention provides a host cell
containing such a vector. The invention also provides a
method for producing Tango-77 protein by culturing, in a
suitable medium, a host cell of the invention containing
a recombinant expression vector such that a Tango-77
protein is produced.

Another aspect of this invention features isolated
or recombinant Tango-77 proteins and polypeptides.
Preferred Tango-77 proteins and polypeptides possess at
least one biological activity possessed by naturally
occurring human Tango-77, e.g., (i) the ability to
interact with proteins in the Tango-77 signalling pathway;
(ii) the ability to interact with a Tango-77 ligand or
receptor; (iii) the ability to interact with an
intracellular target protein; (iv) the ability to
interact with a protein involved in inflammation; and (v)
the ability to bind the IL-1 receptor. Other activities
include the induction and suppression of polypeptide
interleukins, cytokines and growth factors.

The Tango-77 proteins of the present invention, or
biologically active portions thereof, can be operably
linked to a non-Tango-77 polypeptide (e.g., heterologous
amino acid sequences) to form Tango-77 fusion proteins.
The invention further features antibodies that
specifically bind Tango-77 proteins, such as monoclonal
or polyclonal antibodies. In addition, the Tango-77
proteins or biologically active portions thereof can be
incorporated into pharmaceutical compositions, which
optionally include pharmaceutically acceptable carriers.

In another aspect, the present invention provides
a method for detecting the presence of Tango-77 activity
or expression in a biological sample by contacting the
biological sample with an agent capable of detecting an
indicator of Tango-77 activity or expression such that
the presence of Tango-77 activity or expression is
detected in the biological sample.

In another aspect, the invention provides a method
for modulating Tango-77 activity comprising contacting a
cell with an agent that modulates (inhibits or
stimulates) Tango-77 activity or expression such that
Tango-77 activity or expression in the cell is modulated.
In one embodiment, the agent is an antibody that
specifically binds to Tango-77 protein. In another
embodiment, the agent modulates expression of Tango-77 by
modulating transcription of a Tango-77 gene, splicing of
a Tango-77 mRNA, or translation of a Tango-77 mRNA. In
yet another embodiment, the agent is a nucleic acid
molecule having a nucleotide sequence that is antisense
to the coding strand of the Tango-77 mRNA or the Tango-77
gene.

In one embodiment, the methods of the present 
invention are used to treat a subject having a disorder 
characterized by aberrant Tango-77 protein activity or 
nucleic acid expression by administering an agent which 
is a Tango-77 modulator to the subject. In one
embodiment, the Tango-77 modulator is a Tango-77 protein. 
In another embodiment, the Tango-77 modulator is a 
Tango-77 nucleic acid molecule. In other embodiments, 
the Tango-77 modulator is a peptide, peptidomimetic, or 
other small molecule. In a preferred embodiment, the 
disorder characterized by aberrant Tango-77 protein or 
nucleic acid expression can include chronic and acute 
inflammation. 

The present invention also provides a diagnostic 
assay for identifying the presence or absence of a 
genetic lesion or mutation characterized by at least one 
of: (i) aberrant modification or mutation of a gene 
encoding a Tango-77 protein; (ii) mis-regulation of a
gene encoding a Tango-77 protein; and (iii) aberrant 
post-translational modification of a Tango-77 protein, 
wherein a wild-type form of the gene encodes a protein 
with a Tango-77 activity. 

In another aspect, the invention provides a 
method for identifying a compound that binds to or 
modulates the activity of a Tango-77 protein. In 
general, such methods entail measuring a biological 
activity of a Tango-77 protein in the presence and 
absence of a test compound and identifying those
compounds which alter the activity of the Tango-77
protein.

The invention also features methods for 
identifying a compound which modulates the expression of 
Tango-77 by measuring the expression of Tango-77 in the
presence and absence of a compound. 

Other features and advantages of the invention 
will be apparent from the following detailed description 
and claims. 

Brief Description of Drawings

Figure 1 depicts the cDNA sequence (SEQ ID NO:1)
of Tango-77. The Tango-77 cDNA has three possible open
reading frames which encode the amino acid sequences (SEQ
ID NO:2, SEQ ID NO:7 and SEQ ID NO:11) of human Tango-77.
The three potential open reading frames of SEQ ID NO:1
extend from: (1) nucleotide 356 to nucleotide 889 (SEQ ID
NO:3); (2) nucleotide 389 to nucleotide 889 (SEQ ID
NO:6); and (3) nucleotide 481 to nucleotide 889 (SEQ ID
NO:10).
Figure 2 depicts an alignment of an amino acid
sequence of Tango-77 (T77; SEQ ID NO:2) with IL-1ra (SEQ
ID NO:14) and IL-1β (SEQ ID NO:15).

Figure 3 depicts the genomic sequence of BAC1 (SEQ
ID NO:16).

Figure 4 depicts the genomic sequence of BAC2 (SEQ
ID NO:17).

Figure 5 depicts an amino acid sequence of an
alternatively spliced form of Tango-77 (SEQ ID NO:2) as
predicted by Procrustes (T77-procrustes; SEQ ID NO:18).

Figure 6 depicts an alignment of an amino acid
sequence of an alternatively spliced form of Tango-77
(T77-procrustes; SEQ ID NO:18) with Tango-77 (SEQ ID
NO:2).

Figure 7 depicts an alignment of an amino acid
sequence of an alternatively spliced form of Tango-77
(T77-procrustes; SEQ ID NO:18) with IL-1ra (SEQ ID
NO:14) and IL-1β (SEQ ID NO:15).

Detailed Description of the Invention
The present invention is based on the discovery of
a cDNA molecule encoding human Tango-77, a member of the
cytokine superfamily. The cDNA molecule encoding human
Tango-77 has three possible open reading frames. The
three possible nucleotide open reading frames for human
Tango-77 protein are shown in Figure 1 (SEQ ID NO:3, SEQ
ID NO:6 and SEQ ID NO:10). The predicted amino acid
sequences for the three possible Tango-77 immature
proteins are also shown in Figure 1 (SEQ ID NO:2, SEQ ID
NO:7 or SEQ ID NO:11) and three possible mature proteins
are also shown in Figure 1 (SEQ ID NO:5, SEQ ID NO:9 and
SEQ ID NO:13).

The Tango-77 cDNA of Figure 1 (SEQ ID NO:1), which
is approximately 989 nucleotides long including
untranslated regions, encodes a protein having
a molecular weight of approximately 19 kDa, 18 kDa, or
14.9 kDa (excluding post-translational modifications), and
the possible mature form of the protein has a molecular
weight of 13 kDa. A plasmid containing a cDNA encoding
human Tango-77 (with the cDNA insert name of Of fthx077) 
was deposited with American Type Culture Collection 
(ATCC), 10801 University Boulevard, Manassas, Virginia 
20110-2209 on July 2, 1998 and assigned Accession Number 
98807. This deposit will be maintained under the terms 
of the Budapest Treaty on the International Recognition 
of the Deposit of Microorganisms for the Purposes of 
Patent Procedure. This deposit was made merely as a 
convenience for those of skill in the art and is not an
admission that a deposit is required under 35 U.S.C.
§112.
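The molecular weights quoted earlier in this paragraph are broadly in line with a common rule of thumb of roughly 110 Da per amino acid residue. The sketch below applies that approximation to the residue counts given for the three predicted proteins and the mature form. It is not the method by which the stated values were obtained; the constant and the function name `approx_kda` are assumptions made for illustration.

```python
AVERAGE_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (assumption)

def approx_kda(residues):
    """Estimate protein mass in kDa from a residue count, ignoring modifications."""
    return residues * AVERAGE_RESIDUE_DA / 1000.0

for label, n in [("178 aa", 178), ("167 aa", 167), ("136 aa", 136), ("115 aa mature", 115)]:
    print(f"{label}: ~{approx_kda(n):.1f} kDa")
```

A sequence-based calculation using the actual residue composition would be more accurate; the estimate here only shows that the stated weights are of the expected magnitude.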

Human Tango-77 is one member of a family of
molecules (the "Tango-77 family") having certain
conserved structural and functional features. The term
"family," when referring to the protein and nucleic acid
molecules of the invention, is intended to mean two or
more proteins or nucleic acid molecules having a common
structural domain and having sufficient amino acid or
nucleotide sequence identity as defined herein. Such
family members can be naturally occurring and can be from
either the same or different species. For example, a
family can contain a first protein of human origin and a
homologue of that protein of murine origin, as well as a
second, distinct protein of human origin and a murine
homologue of that protein. Members of a family may also
have common functional characteristics.

As used interchangeably herein, a "Tango-77
activity", "biological activity of Tango-77" or
"functional activity of Tango-77" refers to an activity
exerted by a Tango-77 protein, polypeptide or nucleic
acid molecule on a Tango-77 responsive cell as determined
in vivo or in vitro, according to standard techniques.
A Tango-77 activity can be a direct activity, such as an
association with a second protein, or an indirect
activity, such as a cellular signaling activity mediated
by interaction of the Tango-77 protein with a second
protein. In a preferred embodiment, a Tango-77 activity
includes at least one or more of the following
activities: (i) the ability to interact with proteins in
the Tango-77 signalling pathway; (ii) the ability to
interact with a Tango-77 ligand or receptor; (iii) the
ability to interact with an intracellular target protein;
(iv) the ability to interact with a protein involved in
inflammation; and (v) the ability to bind the IL-1
receptor.

Accordingly, another embodiment of the invention
features isolated Tango-77 proteins and polypeptides
having a Tango-77 activity.

Yet another embodiment of the invention features
Tango-77 molecules which contain a signal sequence.
Generally, a signal sequence (or signal peptide) is a
peptide containing about 21 to 63 amino acids which
occurs at the extreme N-terminal end of a secretory
protein. The native Tango-77 signal sequence (SEQ ID
NO:4, SEQ ID NO:8, or SEQ ID NO:12) can be removed and
replaced with a signal sequence from another protein. In
certain host cells (e.g., mammalian host cells),
expression and/or secretion of Tango-77 can be increased
through use of a heterologous signal sequence. For
example, the gp67 secretory sequence of the baculovirus
envelope protein can be used as a heterologous signal
sequence. Alternatively, the native Tango-77 signal
sequence can itself be used as a heterologous signal
sequence in expression systems, e.g., to facilitate the
secretion of a protein of interest.

Various aspects of the invention are described in 
further detail in the following subsections. 

I. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to isolated
nucleic acid molecules that encode Tango-77 proteins or
biologically active portions thereof, as well as nucleic
acid molecules sufficient for use as hybridization probes
to identify Tango-77-encoding nucleic acids (e.g.,
Tango-77 mRNA) and fragments for use as PCR primers for
the amplification or mutation of Tango-77 nucleic acid
molecules. As used herein, the term "nucleic acid
molecule" is intended to include DNA molecules (e.g.,
cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and
analogs of the DNA or RNA generated using nucleotide
analogs. The nucleic acid molecule can be single-stranded
or double-stranded, but preferably is double-stranded
DNA.

An "isolated" nucleic acid molecule is one which
is separated from other nucleic acid molecules which are
present in the natural source of the nucleic acid.
Preferably, an "isolated" nucleic acid is free of
sequences (preferably protein encoding sequences) which
naturally flank the nucleic acid (i.e., sequences located
at the 5' and 3' ends of the nucleic acid) in the genomic
DNA of the organism from which the nucleic acid is
derived. For example, in various embodiments, the
isolated Tango-77 nucleic acid molecule can contain less
than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb
of nucleotide sequences which naturally flank the nucleic
acid molecule in genomic DNA of the cell from which the
nucleic acid is derived. Moreover, an "isolated" nucleic
acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture
medium when produced by recombinant techniques, or
substantially free of chemical precursors or other
chemicals when chemically synthesized.

A nucleic acid molecule of the present invention,
e.g., a nucleic acid molecule having the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, the cDNA of ATCC 98807, or a complement of any of
these nucleotide sequences, can be isolated using
standard molecular biology techniques and the sequence
information provided herein. Using all or a portion of
the nucleic acid sequences of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or the
complement thereof as a hybridization probe, Tango-77
nucleic acid molecules can be isolated using standard
hybridization and cloning techniques (e.g., as described
in Sambrook et al., eds., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, 1989).

A nucleic acid of the invention can be amplified
using cDNA, mRNA or genomic DNA as a template and
appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so
amplified can be cloned into an appropriate vector and
characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to Tango-77 nucleotide
sequences can be prepared by standard synthetic
techniques, e.g., using an automated DNA synthesizer.
In another preferred embodiment, an isolated
nucleic acid molecule of the invention comprises a
nucleic acid molecule which is a complement of the
nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
portion thereof. A nucleic acid molecule which is
complementary to a given nucleotide sequence is one which
is sufficiently complementary to the given nucleotide
sequence that it can hybridize to the given nucleotide
sequence thereby forming a stable duplex.
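The notion of a complement can be made concrete with a minimal sketch. The Python below computes the exact Watson-Crick reverse complement of a DNA string; it models only perfect complementarity, which is narrower than the "sufficiently complementary" standard described above. The function name and the toy sequence are hypothetical and are not taken from any SEQ ID NO.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

toy = "ATGGCTAAAC"  # hypothetical sequence for illustration only
print(reverse_complement(toy))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing probe or antisense sequences.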
Moreover, the nucleic acid molecule of the
invention can comprise only a portion of a nucleic acid
sequence encoding Tango-77, for example, a fragment which
can be used as a probe or primer or a fragment encoding a
biologically active portion of Tango-77. The nucleotide
sequence determined from the cloning of the human
Tango-77 gene allows for the generation of probes and
primers designed for use in identifying and/or cloning
Tango-77 homologues in other cell types, e.g., from other
tissues, as well as Tango-77 homologues from other
mammals. The probe/primer typically comprises
substantially purified oligonucleotide. The
oligonucleotide typically comprises a region of
nucleotide sequence that hybridizes under stringent
conditions to at least about 12, preferably about 25,
more preferably about 50, 75, 100, 125, 150, 175, 200,
250, 300, 350 or 400 consecutive nucleotides of the sense
or anti-sense sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:6, SEQ ID NO:10, or the cDNA of ATCC 98807.
Alternatively, the oligonucleotide can typically comprise
a region of nucleotide sequence that hybridizes under
stringent conditions to at least about 12, preferably
about 25, more preferably about 50, 75, 100, 125, 150,
175, 200, 250, 300, 350 or 400 consecutive nucleotides of
the sense or anti-sense sequence of a naturally occurring
mutant of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, or the cDNA of ATCC 98807.
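To illustrate what "consecutive nucleotides" of a reference sequence means in practice, the sketch below enumerates every window of a chosen length along a sequence. It is only an enumeration aid under simple assumptions: it does not model hybridization or stringency, and the function name `probe_windows` and the toy sequence are hypothetical.

```python
def probe_windows(seq, length=12):
    """Yield (start, subsequence) for each run of `length` consecutive bases.

    `start` is 1-based, matching the nucleotide numbering used in the text.
    """
    for i in range(len(seq) - length + 1):
        yield i + 1, seq[i:i + length]

toy = "ATGGCTAAACGTTAGC"  # hypothetical 16-nt sequence for illustration only
for start, window in probe_windows(toy, 12):
    print(start, window)
```

A sequence of n nucleotides contains n - length + 1 such windows, and none when it is shorter than the requested length.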

Probes based on the human Tango-77 nucleotide
sequence can be used to detect transcripts or genomic
sequences encoding the same or identical proteins. The
probe comprises a label group attached thereto, e.g., a
radioisotope, a fluorescent compound, an enzyme, or an
enzyme co-factor. Such probes can be used as a part of a
diagnostic test kit for identifying cells or tissues
which mis-express a Tango-77 protein, such as by
measuring a level of a Tango-77-encoding nucleic acid in
a sample of cells from a subject, e.g., detecting
Tango-77 mRNA levels or determining whether a genomic
Tango-77 gene has been mutated or deleted.

A nucleic acid fragment encoding a "biologically
active portion of Tango-77" can be prepared by isolating
a portion of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10 or the nucleotide sequence of the cDNA of ATCC
98807 which encodes a polypeptide having a Tango-77
biological activity, expressing the encoded portion of
Tango-77 protein (e.g., by recombinant expression in
vitro) and assessing the activity of the encoded portion
of Tango-77.

The invention further encompasses nucleic acid
molecules that differ from the nucleotide sequence of SEQ
ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or the
cDNA of ATCC 98807 due to degeneracy of the genetic code
and thus encode the same Tango-77 protein as that encoded
by the nucleotide sequence shown in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:6, SEQ ID NO:10, or the cDNA of ATCC
98807.
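Degeneracy of the genetic code can be shown with a small worked example. The sketch below builds the standard codon table and translates two short coding sequences that differ at several nucleotide positions yet encode the same peptide. Both sequences are hypothetical and are not derived from SEQ ID NO:1; the function name `translate` is likewise chosen for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Two hypothetical sequences that differ only at third codon positions.
seq_a = "ATGGCTAAATGA"
seq_b = "ATGGCCAAGTAA"
print(translate(seq_a), translate(seq_b))
```

Both toy sequences translate to the same three-residue peptide even though their nucleotide sequences are not identical, which is the situation the paragraph above describes.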

In addition to the human Tango-77 nucleotide
sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6,
SEQ ID NO:10, or the cDNA of ATCC 98807, it will be
appreciated by those skilled in the art that DNA sequence
polymorphisms that lead to changes in the amino acid
sequences of Tango-77 may exist within a population
(e.g., the human population). Such genetic polymorphism
in the Tango-77 gene may exist among individuals within a
population due to natural allelic variation. An allele
is one of a group of genes which occur alternatively at a
given genetic locus. As used herein, the term "allelic
variant" refers to a nucleotide sequence which occurs at
a Tango-77 locus or to a polypeptide encoded by the
nucleotide sequence. As used herein, the terms "gene"
and "recombinant gene" refer to nucleic acid molecules
comprising an open reading frame encoding a Tango-77
protein, preferably a mammalian Tango-77 protein. Such
natural allelic variations can typically result in 1-5%
variance in the nucleotide sequence of the Tango-77 gene.
Alternative alleles can be identified by sequencing the
gene of interest in a number of different individuals.
This can be readily carried out by using hybridization
probes to identify the same genetic locus in a variety of
individuals. Any and all such nucleotide variations and
resulting amino acid polymorphisms or variations in
Tango-77 that are the result of natural allelic variation
and that do not alter the functional activity of Tango-77
are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding Tango-77
proteins from other species (Tango-77 homologues), which
have a nucleotide sequence which differs from that of a
human Tango-77, are intended to be within the scope of
the invention. Nucleic acid molecules corresponding to
natural allelic variants and homologues of the Tango-77
cDNA of the invention can be isolated based on their
identity to the human Tango-77 nucleic acids disclosed
herein using the human cDNAs, or a portion thereof, as a
hybridization probe according to standard hybridization
techniques under stringent hybridization conditions.

Accordingly, in another embodiment, an isolated
nucleic acid molecule of the invention is at least 300
(325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700,
800, or 989) nucleotides in length and hybridizes under
stringent conditions to the nucleic acid molecule
comprising the nucleotide sequence, preferably the coding
sequence, of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10, or the cDNA of ATCC 98807.

As used herein, the term "hybridizes under
stringent conditions" is intended to describe conditions
for hybridization and washing under which nucleotide
sequences at least 60% (65%, 70%, preferably 75%)
identical to each other typically remain hybridized to
each other. Such stringent conditions are known to those
skilled in the art and can be found in Current Protocols
in Molecular Biology, John Wiley & Sons, N.Y. (1989),
6.3.1-6.3.6. A preferred, non-limiting example of
stringent hybridization conditions is hybridization in
6X sodium chloride/sodium citrate (SSC) at about 45°C,
followed by one or more washes in 0.2X SSC, 0.1% SDS at
50-65°C. Preferably, an isolated nucleic acid molecule
of the invention that hybridizes under stringent
conditions to the sequence of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or the
complement thereof, corresponds to a naturally-occurring
nucleic acid molecule. As used herein, a "naturally-occurring"
nucleic acid molecule refers to an RNA or DNA
molecule having a nucleotide sequence that occurs in
nature (e.g., encodes a natural protein).
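The percent-identity thresholds used in the definition above can be illustrated with a minimal calculation. The sketch below assumes the two sequences are already aligned to equal length and takes percent identity to be the number of matching positions divided by the aligned length, times 100. That definition, the function name, and the toy sequences are assumptions made for illustration; any formal definition of identity given elsewhere in the specification governs.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# Hypothetical 10-nt sequences that differ at the last two positions.
ref = "ACGTACGTAC"
var = "ACGTACGTTT"
identity = percent_identity(ref, var)
print(identity, identity >= 60.0)
```

Identity computed this way is only a proxy for hybridization behaviour, which also depends on length, base composition and the wash conditions described above.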

In addition to naturally-occurring allelic
variants of the Tango-77 sequence that may exist in the
population, the skilled artisan will further appreciate
that changes can be introduced by mutation into the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10 or the cDNA of ATCC 98807, thereby
leading to changes in the amino acid sequence of the
encoded Tango-77 protein, without altering the biological
activity of the Tango-77 protein. Amino acid residues
that are not conserved or only semiconserved among
Tango-77 of various species may be non-essential for
activity and thus would likely be targets for alteration.

Alternatively, one can make nucleotide substitutions
leading to amino acid substitutions at "non-essential"
amino acid residues. A "non-essential" amino acid
residue is a residue that can be altered from the
wild-type sequence of Tango-77 (e.g., the sequence of SEQ
ID NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11 or SEQ ID NO:13) without altering the biological
activity, whereas an "essential" amino acid residue is
required for biological activity. For example, amino
acid residues that are conserved among the Tango-77
proteins of various species may be essential for activity
and thus would not likely be targets for alteration,
unless one wishes to reduce or alter Tango-77 activity.

Accordingly, another aspect of the invention
pertains to nucleic acid molecules encoding Tango-77
proteins that contain changes in amino acid residues that
are not essential for activity. Such Tango-77 proteins
differ in amino acid sequence from SEQ ID NO:2, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID
NO:13 yet retain biological activity. In one embodiment,
the isolated nucleic acid molecule includes a nucleotide
sequence encoding a protein that includes an amino acid
sequence that is at least about 45%, 65%, 75%, 85%, 95%,
or 98% identical to the amino acid sequence of
SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, or SEQ ID NO:13.

An isolated nucleic acid molecule encoding a
Tango-77 protein having a sequence which differs from
that of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, or SEQ ID NO:13 can be created by
introducing one or more nucleotide substitutions,
additions or deletions into the nucleotide sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or
the cDNA of ATCC 98807 such that one or more amino acid
substitutions, additions or deletions are introduced into
the encoded protein. Mutations can be introduced by
standard techniques, such as site-directed mutagenesis
and PCR-mediated mutagenesis. Preferably, conservative
amino acid substitutions are made at one or more
predicted non-essential amino acid residues. A
"conservative amino acid substitution" is one in which
the amino acid residue is replaced with an amino acid
residue having a similar side chain. Families of amino
acid residues having similar side chains have been
defined in the art. These families include amino acids
with basic side chains (e.g., lysine, arginine,
histidine), acidic side chains (e.g., aspartic acid,
glutamic acid), uncharged polar side chains (e.g.,
glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine,
valine, leucine, isoleucine, proline, phenylalanine,
methionine, tryptophan), beta-branched side chains (e.g.,
threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine).
Thus, a predicted nonessential amino acid residue in
Tango-77 is preferably replaced with another amino acid
residue from the same side chain family. Alternatively,
mutations can be introduced randomly along all or part of
a Tango-77 coding sequence, such as by saturation
mutagenesis, and the resultant mutants can be screened
for Tango-77 biological activity to identify mutants that
retain activity. Following mutagenesis, the encoded
protein can be expressed recombinantly and the activity
of the protein can be determined.
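
The side chain families listed above can be written down as a small
lookup table. The following is a minimal sketch only: the family
labels and the function names are illustrative assumptions and are
not part of the disclosure, and residues are given in one-letter
code. A substitution is treated as conservative when both residues
appear together in at least one of the listed families.

```python
# Side chain families as enumerated in the text (one-letter codes).
# The dictionary keys are illustrative labels, not terms from the source.
SIDE_CHAIN_FAMILIES = {
    "basic": {"K", "R", "H"},            # lysine, arginine, histidine
    "acidic": {"D", "E"},                # aspartic acid, glutamic acid
    "uncharged_polar": {"G", "N", "Q", "S", "T", "Y", "C"},
    "nonpolar": {"A", "V", "L", "I", "P", "F", "M", "W"},
    "beta_branched": {"T", "V", "I"},
    "aromatic": {"Y", "F", "W", "H"},
}


def is_conservative(original, replacement):
    """Return True if the two residues share at least one family."""
    if original == replacement:
        return False  # an unchanged residue is not a substitution
    return any(
        original in members and replacement in members
        for members in SIDE_CHAIN_FAMILIES.values()
    )


def conservative_options(residue):
    """Return every residue sharing a family with `residue`."""
    options = set()
    for members in SIDE_CHAIN_FAMILIES.values():
        if residue in members:
            options |= members
    options.discard(residue)
    return options
```

Because several residues belong to more than one family (e.g.,
histidine is listed as both basic and aromatic), the sketch accepts
a replacement drawn from any family the original residue belongs to.
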

In a preferred embodiment, a mutant Tango-77
protein can be assayed for: (1) the ability to form
protein:protein interactions with proteins in the
Tango-77 signalling pathway; (2) the ability to bind a
Tango-77 ligand or receptor; (3) the ability to bind
to an intracellular target protein; (4) the ability to
interact with a protein involved in inflammation; or (5)
the ability to bind the IL-1 receptor. In yet another
preferred embodiment, a mutant Tango-77 can be assayed
for the ability to modulate inflammation, asthma,
autoimmune diseases, and sepsis.

The present invention encompasses antisense
nucleic acid molecules, i.e., molecules which are
complementary to a sense nucleic acid encoding a protein,
e.g., complementary to the coding strand of a
double-stranded cDNA molecule or complementary to an mRNA
sequence. Accordingly, an antisense nucleic acid can
hydrogen bond to a sense nucleic acid. The antisense
nucleic acid can be complementary to an entire Tango-77
coding strand, or to only a portion thereof, e.g., all or
part of the protein coding region (or open reading
frame). An antisense nucleic acid molecule can be
antisense to a noncoding region of the coding strand of a
nucleotide sequence encoding Tango-77. The noncoding
regions ("5' and 3' untranslated regions") are the 5' and
3' sequences which flank the coding region and are not
translated into amino acids.

Given the coding strand sequences encoding
Tango-77 disclosed herein (e.g., SEQ ID NO:3, SEQ ID NO:5,
or SEQ ID NO:8), antisense nucleic acids of the invention
can be designed according to the rules of Watson and
Crick base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of
Tango-77 mRNA, but more preferably is an oligonucleotide
which is antisense to only a portion of the coding or
noncoding region of Tango-77 mRNA. For example, the
antisense oligonucleotide can be complementary to the
region surrounding the translation start site of Tango-77
mRNA, e.g., an oligonucleotide having the sequence
5'-TGCAACTTTTACAGGAAACAC-3' (SEQ ID NO:19) or
5'-CCTCACTTTTACCCGAGACTC-3' (SEQ ID NO:20) or
5'-GACGGGTGGTACTTAAAACAA-3' (SEQ ID NO:21). An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20,
25, 30, 35, 40, 45 or 50 nucleotides in length. An
antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic
ligation reactions using procedures known in the art.
For example, an antisense nucleic acid (e.g., an
antisense oligonucleotide) can be chemically synthesized
using naturally occurring nucleotides or variously
modified nucleotides designed to increase the biological
stability of the molecules or to increase the physical
stability of the duplex formed between the antisense and
sense nucleic acids, e.g., phosphorothioate derivatives
and acridine substituted nucleotides can be used.

Examples of modified nucleotides which can be used to
generate the antisense nucleic acid include
5-fluorouracil, 5-bromouracil, 5-chlorouracil,
5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine,
2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid
(v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid
methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using
an expression vector into which a nucleic acid has been
subcloned in an antisense orientation (i.e., RNA
transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of
interest, described further in the following subsection).
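
Designing an antisense oligonucleotide by Watson and Crick base
pairing amounts to taking the reverse complement of the chosen
window of the sense strand. The following is a minimal sketch under
stated assumptions: the function name and the example sequence are
hypothetical (the example is not a Tango-77 sequence), the sense
strand is written 5' to 3', and the oligonucleotide is returned as
unmodified DNA written 5' to 3'.

```python
# Watson-Crick complements; U in an mRNA target pairs with A.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(sense, start, length):
    """Return the DNA oligo (5'->3') complementary to a window of `sense`.

    `sense` is the coding strand or mRNA written 5'->3'; `start` is the
    zero-based index of the first targeted nucleotide.
    """
    target = sense.upper()[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("requested window lies outside the sequence")
    if set(target) - set("ACGTU"):
        raise ValueError("unexpected character in sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical mRNA fragment with a start codon (AUG) at index 6.
example_mrna = "GGCACCAUGGCUAGCUGA"
oligo = antisense_oligo(example_mrna, 3, 9)  # window spanning the AUG
```

In this example the nine-nucleotide window ACCAUGGCU yields the
oligonucleotide AGCCATGGT. Modified nucleotides or backbone
chemistry of the kind listed above are outside the scope of the
sketch.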

The antisense nucleic acid molecules of the
invention are typically administered to a subject or
generated in situ such that they hybridize with or bind
to cellular mRNA and/or genomic DNA encoding a Tango-77
protein to thereby inhibit expression of the protein,
e.g., by inhibiting transcription and/or translation.
The hybridization can be by conventional nucleotide
complementarity to form a stable duplex, or, for example,
in the case of an antisense nucleic acid molecule which
binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a
route of administration of antisense nucleic acid
molecules of the invention includes direct injection at a
tissue site. Alternatively, antisense nucleic acid
molecules can be modified to target selected cells and
then administered systemically. For example, for
systemic administration, antisense molecules can be
modified such that they specifically bind to receptors or
antigens expressed on a selected cell surface, e.g., by
linking the antisense nucleic acid molecules to peptides
or antibodies which bind to cell surface receptors or
antigens. The antisense nucleic acid molecules can also
be delivered to cells using the vectors described herein.
To achieve sufficient intracellular concentrations of the
antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are
preferred.

An antisense nucleic acid molecule of the
invention can be an α-anomeric nucleic acid molecule. An
α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which,
contrary to the usual β-units, the strands run parallel
to each other (Gautier et al. (1987) Nucleic Acids Res.
15:6625-6641). The antisense nucleic acid molecule can
also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res. 15:6131-6148) or a chimeric
RNA-DNA analogue (Inoue et al. (1987) FEBS Lett.
215:327-330).

The invention also encompasses ribozymes.
Ribozymes are catalytic RNA molecules with ribonuclease
activity which are capable of cleaving a single-stranded
nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead
ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave
Tango-77 mRNA transcripts to thereby inhibit translation
of Tango-77 mRNA. A ribozyme having specificity for a
Tango-77-encoding nucleic acid can be designed based upon
the nucleotide sequence of a Tango-77 cDNA disclosed
herein (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10). For example, a derivative of a Tetrahymena
L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a Tango-77-encoding
mRNA. See, e.g., Cech et al. U.S. Patent No. 4,987,071;
and Cech et al. U.S. Patent No. 5,116,742.
Alternatively, Tango-77 mRNA can be used to select a
catalytic RNA having a specific ribonuclease activity
from a pool of RNA molecules. See, e.g., Bartel and
Szostak (1993) Science 261:1411-1418.

The invention also encompasses nucleic acid
molecules which form triple helical structures. For
example, Tango-77 gene expression can be inhibited by
targeting nucleotide sequences complementary to the
regulatory region of the Tango-77 (e.g., the Tango-77
promoter and/or enhancers) to form triple helical
structures that prevent transcription of the Tango-77
gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6(6):569-84; Helene (1992) Ann. N.Y.
Acad. Sci. 660:27-36; and Maher (1992) Bioassays
14(12):807-15.

In preferred embodiments, the nucleic acid
molecules of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve,
e.g., the stability, hybridization, or solubility of the
molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate
peptide nucleic acids (see Hyrup et al. (1996) Bioorganic
& Medicinal Chemistry 4(1):5-23). As used herein, the
terms "peptide nucleic acids" or "PNAs" refer to nucleic
acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide
backbone and only the four natural nucleobases are
retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA
oligomers can be performed using standard solid phase
peptide synthesis protocols as described in Hyrup et al.
(1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl.
Acad. Sci. USA 93:14670-675.

PNAs of Tango-77 can be used in therapeutic and
diagnostic applications. For example, PNAs can be used
as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing
transcription or translation arrest or inhibiting
replication. PNAs of Tango-77 can also be used, e.g., in
the analysis of single base pair mutations in a gene by,
e.g., PNA directed PCR clamping; as artificial
restriction enzymes when used in combination with other
enzymes, e.g., S1 nucleases (Hyrup (1996) supra); or as
probes or primers for DNA sequence and hybridization
(Hyrup (1996) supra; Perry-O'Keefe et al. (1996) Proc.
Natl. Acad. Sci. USA 93:14670-675).

In another embodiment, PNAs of Tango-77 can be
modified, e.g., to enhance their stability or cellular
uptake, by attaching lipophilic or other helper groups to
PNA, by the formation of PNA-DNA chimeras, or by the use
of liposomes or other techniques of drug delivery known
in the art. For example, PNA-DNA chimeras of Tango-77
can be generated which may combine the advantageous
properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNAse H and DNA polymerases,
to interact with the DNA portion while the PNA portion
would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of
appropriate lengths selected in terms of base stacking,
number of bonds between the nucleobases, and orientation
(Hyrup (1996) supra). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) supra and
Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63.
For example, a DNA chain can be synthesized on a solid
support using standard phosphoramidite coupling chemistry
and modified nucleoside analogs. Compounds such as
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite
can be used as a link between the PNA and the 5' end of
DNA (Mag et al. (1989) Nucleic Acid Res. 17:5973-88).
PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a
3' DNA segment (Finn et al. (1996) Nucleic Acids Res.
24(17):3357-63). Alternatively, chimeric molecules can
be synthesized with a 5' DNA segment and a 3' PNA segment
(Petersen et al. (1995) Bioorganic Med. Chem. Lett.
5:1119-1124).

In other embodiments, the oligonucleotide may
include other appended groups such as peptides (e.g., for
targeting host cell receptors in vivo), or agents
facilitating transport across the cell membrane (see,
e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA
86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad.
Sci. USA 84:648-652; PCT Publication No. WO 88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No.
WO 89/10134). In addition, oligonucleotides can be
modified with hybridization-triggered cleavage agents
(see, e.g., Krol et al. (1988) Bio/Techniques 6:958-976)
or intercalating agents (see, e.g., Zon (1988) Pharm.
Res. 5:539-549). To this end, the oligonucleotide may be
conjugated to another molecule, e.g., a peptide,
hybridization triggered cross-linking agent, transport
agent, hybridization-triggered cleavage agent, etc.

II. Isolated Tango-77 Proteins and Anti-Tango-77 Antibodies

One aspect of the invention pertains to isolated 
Tango-77 proteins, and biologically active portions 
thereof, as well as polypeptide fragments suitable for 
use as immunogens to raise anti-Tango-77 antibodies. In 
one embodiment, native Tango-77 proteins can be isolated 
from cells or tissue sources by an appropriate 
purification scheme using standard protein purification 
techniques. In another embodiment, Tango-77 proteins are 
produced by recombinant DNA techniques. As an alternative
to recombinant expression, a Tango-77 protein or polypeptide
can be synthesized chemically using standard peptide
synthesis techniques.

An "isolated" or "purified" protein or
biologically active portion thereof is substantially free
of cellular material or other contaminating proteins from
the cell or tissue source from which the Tango-77 protein
is derived, or substantially free of chemical precursors
or other chemicals when chemically synthesized. The
language "substantially free of cellular material"
includes preparations of Tango-77 protein in which the
protein is separated from cellular components of the
cells from which it is isolated or recombinantly
produced. Thus, Tango-77 protein that is substantially
free of cellular material includes preparations of
Tango-77 protein having less than about 30%, 20%, 10%, or
5% (by dry weight) of non-Tango-77 protein (also referred
to herein as a "contaminating protein"). When the
Tango-77 protein or biologically active portion thereof
is recombinantly produced, it is also preferably
substantially free of culture medium, i.e., culture
medium represents less than about 20%, 10%, or 5% of the
volume of the protein preparation. When Tango-77 protein
is produced by chemical synthesis, it is preferably
substantially free of chemical precursors or other
chemicals, i.e., it is separated from chemical precursors
or other chemicals which are involved in the synthesis of
the protein. Accordingly, such preparations of Tango-77
protein have less than about 30%, 20%, 10%, or 5% (by dry
weight) of chemical precursors or non-Tango-77 chemicals.

Biologically active portions of a Tango-77 protein
include peptides comprising amino acid sequences
sufficiently identical to or derived from the amino acid
sequence of the Tango-77 protein (e.g., the amino acid
sequence shown in SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13), which
include fewer amino acids than the full length Tango-77
proteins, and exhibit at least one activity of a Tango-77
protein. Typically, biologically active portions
comprise a domain or motif with at least one activity of
the Tango-77 protein. A biologically active portion of a
Tango-77 protein can be a polypeptide which is, for
example, 10, 25, 50, 100 or more amino acids in length.

Moreover, other biologically active portions, in
which other regions of the protein are deleted, can be
prepared by recombinant techniques and evaluated for one
or more of the functional activities of a native Tango-77
protein.

Preferred Tango-77 protein has the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. Other useful
Tango-77 proteins are substantially identical to SEQ ID
NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, or SEQ ID NO:13 and retain the functional activity
of the protein of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 yet differ in
amino acid sequence due to natural allelic variation or
mutagenesis. Accordingly, a useful Tango-77 protein is a
protein which includes an amino acid sequence at least
about 45%, preferably 55%, 65%, 75%, 85%, 95%, or 99%
identical to the amino acid sequence of SEQ ID NO:2, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ
ID NO:13 and retains the functional activity of the
Tango-77 proteins of SEQ ID NO:2, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. In a
preferred embodiment, the Tango-77 protein retains a
functional activity of the Tango-77 protein of SEQ ID
NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, or SEQ ID NO:13.

To determine the percent identity of two amino
acid sequences or of two nucleic acids, the sequences are
aligned for optimal comparison purposes (e.g., gaps can
be introduced in the sequence of a first amino acid or
nucleic acid sequence for optimal alignment with a second
amino or nucleic acid sequence). The amino acid residues
or nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position
in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in
the second sequence, then the molecules are identical at
that position. The percent identity between the two
sequences is a function of the number of identical
positions shared by the sequences (i.e., % identity = #
of identical positions/total # of positions (e.g.,
overlapping positions) x 100). Preferably, the two
sequences are the same length.
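
The percent identity formula above can be sketched in a few lines.
This is a minimal illustration, not the method of any particular
alignment program: the function name and the `count_gap_columns`
option are assumptions introduced here, the two sequences are
assumed to be already aligned to equal length with gaps written as
"-", and only exact matches are counted as identical.

```python
def percent_identity(seq_a, seq_b, count_gap_columns=True):
    """Percent identity of two pre-aligned, equal-length sequences.

    % identity = (# identical positions / total # positions) x 100.
    If count_gap_columns is False, columns containing a gap are left
    out of the total (an assumed convention for the "overlapping
    positions" case).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        has_gap = a == "-" or b == "-"
        if has_gap and not count_gap_columns:
            continue
        total += 1
        if not has_gap and a == b:
            identical += 1
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total
```

For example, two aligned eight-residue sequences that differ at one
position give 7/8 x 100 = 87.5% identity.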

The determination of percent homology between two
sequences can be accomplished using a mathematical
algorithm. A preferred, non-limiting example of a
mathematical algorithm utilized for the comparison of two
sequences is the algorithm of Karlin and Altschul (1990)
Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in
Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA
90:5873-5877. Such an algorithm is incorporated into the
NBLAST and XBLAST programs of Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST nucleotide searches can
be performed with the NBLAST program, score = 100,
wordlength = 12 to obtain nucleotide sequences homologous
to Tango-77 nucleic acid molecules of the invention.
BLAST protein searches can be performed with the XBLAST
program, score = 50, wordlength = 3 to obtain amino acid
sequences homologous to Tango-77 protein molecules of the
invention. To obtain gapped alignments for comparison
purposes, Gapped BLAST can be utilized as described in
Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402.
When utilizing BLAST and Gapped BLAST programs, the
default parameters of the respective programs (e.g.,
XBLAST and NBLAST) can be used. See
http://www.ncbi.nlm.nih.gov. Another preferred,
non-limiting example of a mathematical algorithm utilized for
the comparison of sequences is the algorithm of Myers and
Miller, CABIOS (1989). Such an algorithm is incorporated
into the ALIGN program (version 2.0) which is part of the
GCG sequence alignment software package. When utilizing
the ALIGN program for comparing amino acid sequences, a
PAM120 weight residue table, a gap length penalty of 12,
and a gap penalty of 4 can be used.

The percent identity between two sequences can be
determined using techniques similar to those described
above, with or without allowing gaps. In calculating
percent identity, only exact matches are counted.

The invention also provides Tango-77 chimeric or
fusion proteins. As used herein, a Tango-77 "chimeric
protein" or "fusion protein" comprises a Tango-77
polypeptide operably linked to a non-Tango-77
polypeptide. A "Tango-77 polypeptide" refers to a
polypeptide having an amino acid sequence corresponding
to Tango-77 polypeptides, whereas a "non-Tango-77
polypeptide" refers to a polypeptide having an amino acid
sequence corresponding to a protein which is not
substantially identical to the Tango-77 protein, e.g., a
protein which is different from the Tango-77 protein and
which is derived from the same or a different organism.
Within a Tango-77 fusion protein the Tango-77 polypeptide
can correspond to all or a portion of a Tango-77 protein,
preferably at least one biologically active portion of a
Tango-77 protein. Within the fusion protein, the term
"operably linked" is intended to indicate that the
Tango-77 polypeptide and the non-Tango-77 polypeptide are
fused in-frame to each other. The non-Tango-77
polypeptide can be fused to the N-terminus or C-terminus
of the Tango-77 polypeptide.

One useful fusion protein is a GST-Tango-77 fusion 
protein in which the Tango-77 sequences are fused to the 
C-terminus of the GST sequences. Such fusion proteins 
can facilitate the purification of recombinant Tango-77. 

In another embodiment, the fusion protein is a
Tango-77 protein containing a heterologous signal
sequence at its N-terminus. For example, the native
Tango-77 signal sequence (i.e., about amino acids 1 to 63
of SEQ ID NO:2; SEQ ID NO:4; or about amino acids 1 to 52
of SEQ ID NO:7; SEQ ID NO:8; or about amino acids 1 to 21
of SEQ ID NO:11; SEQ ID NO:12) can be removed and replaced
with a signal sequence from another protein. In certain
host cells (e.g., mammalian host cells), expression
and/or secretion of Tango-77 can be increased through use
of a heterologous signal sequence. For example, the gp67
secretory sequence of the baculovirus envelope protein
can be used as a heterologous signal sequence (Ausubel et
al., supra). Other examples of eukaryotic heterologous
signal sequences include the secretory sequences of
melittin and human placental alkaline phosphatase
(Stratagene; La Jolla, California). In yet another
example, useful prokaryotic heterologous signal sequences
include the phoA secretory signal (Sambrook et al.,
supra) and the protein A secretory signal (Pharmacia
Biotech; Piscataway, New Jersey).
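
At the sequence level, replacing a native signal sequence with a
heterologous one is a simple cut-and-join operation. The following
is a rough sketch only: the function name is an assumption, the
sequences in the example are short placeholders rather than Tango-77
or gp67 sequences, and the length of the native signal sequence is
supplied by the caller (the text above gives it only approximately).

```python
def replace_signal_sequence(protein, native_signal_length,
                            heterologous_signal):
    """Swap the N-terminal signal sequence of `protein`.

    The first `native_signal_length` residues are removed and the
    heterologous signal is placed at the N-terminus of what remains.
    """
    if not 0 < native_signal_length < len(protein):
        raise ValueError("signal length must fall inside the protein")
    if not heterologous_signal:
        raise ValueError("heterologous signal sequence is empty")
    mature = protein[native_signal_length:]
    return heterologous_signal + mature


# Placeholder sequences for illustration only.
fused = replace_signal_sequence("MKLLVSTQ", 4, "MRRA")
```

Here the placeholder protein MKLLVSTQ, with an assumed four-residue
signal, becomes MRRAVSTQ.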

In yet another embodiment, the fusion protein is
a Tango-77-immunoglobulin fusion protein in which all or
part of Tango-77 is fused to sequences derived from a
member of the immunoglobulin protein family. The
Tango-77-immunoglobulin fusion proteins of the invention
can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction
between a Tango-77 ligand and a Tango-77 receptor on the
surface of a cell, to thereby suppress Tango-77-mediated
signal transduction in vivo. The Tango-77-immunoglobulin
fusion proteins can be used to affect the bioavailability
of a Tango-77 cognate ligand. Inhibition of the Tango-77
ligand/Tango-77 interaction may be useful therapeutically
for the treatment of both inflammatory and autoimmune
disorders. Moreover, the Tango-77-immunoglobulin fusion
proteins of the invention can be used as immunogens to
produce anti-Tango-77 antibodies in a subject, to purify
Tango-77 ligands and in screening assays to identify
molecules which inhibit the interaction of Tango-77 with
a Tango-77 receptor.

Preferably, a Tango-77 chimeric or fusion protein
of the invention is produced by standard recombinant DNA
techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together
in-frame in accordance with conventional techniques, for
example by employing blunt-ended or stagger-ended termini
for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and enzymatic ligation. In another
embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA
synthesizers. Alternatively, PCR amplification of gene
fragments can be carried out using anchor primers which
give rise to complementary overhangs between two
consecutive gene fragments which can subsequently be
annealed and reamplified to generate a chimeric gene
sequence (see, e.g., Current Protocols in Molecular
Biology, Ausubel et al. eds., John Wiley & Sons: 1992).
Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a
GST polypeptide). A Tango-77-encoding nucleic acid can
be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the Tango-77 protein.
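
The requirement that the two coding sequences be joined in-frame
can be checked computationally before any cloning is attempted. The
following is a minimal sketch under stated assumptions: the function
names and the example sequences are hypothetical, both inputs are
DNA coding sequences that start at the first base of a codon, and
restriction sites, linkers and vector sequence are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into consecutive three-base codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(n_terminal_cds, c_terminal_cds):
    """Join two coding sequences so both are read in the same frame."""
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    for name, cds in (("N-terminal", first), ("C-terminal", second)):
        if len(cds) % 3 != 0:
            raise ValueError(
                name + " coding sequence is not a whole number of codons")
    first_codons = split_codons(first)
    # Drop a terminal stop codon so translation continues into the partner.
    if first_codons and first_codons[-1] in STOP_CODONS:
        first_codons = first_codons[:-1]
    if any(codon in STOP_CODONS for codon in first_codons):
        raise ValueError("internal stop codon in N-terminal coding sequence")
    return "".join(first_codons) + second
```

For instance, joining the placeholder sequences ATGGCCTAA and
ATGAAATGA removes the first stop codon and returns ATGGCCATGAAATGA,
whose length remains a multiple of three.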

The present invention also pertains to variants of 
the Tango-77 proteins (i.e., proteins having a sequence 
which differs from that of the Tango-77 amino acid 
sequence) . Such variants can function as either Tango-77 
agonists (mimetics) or as Tango-77 antagonists. Variants 
of the Tango-77 protein can be generated by mutagenesis, 
e.g., discrete point mutation or truncation of the 
Tango-77 protein. An agonist of the Tango-77 protein can 
retain substantially the same, or a subset, of the 
biological activities of the naturally occurring form of 
the Tango-77 protein. An antagonist of the Tango-77 
protein can inhibit one or more of the activities of the 
naturally occurring form of the Tango-77 protein by, for 
example, competitively binding to a downstream or 
upstream member of a cellular signaling cascade which 
includes the Tango-77 protein. Thus, specific biological 
effects can be elicited by treatment with a variant of 
limited function. Treatment of a subject with a variant 
having a subset of the biological activities of the 
naturally occurring form of the protein can have fewer
side effects in a subject relative to treatment with the
naturally occurring form of the Tango-77 proteins.

Variants of the Tango-77 protein which function as 
either Tango-77 agonists (mimetics) or as Tango-77 
antagonists can be identified by screening combinatorial 
libraries of mutants, e.g., truncation mutants, of the 
Tango-77 protein for Tango-77 protein agonist or 
antagonist activity. In one embodiment, a variegated 
library of Tango-77 variants is generated by 
combinatorial mutagenesis at the nucleic acid level and 
is encoded by a variegated gene library. A variegated 
library of Tango-77 variants can be produced by, for 
example, enzymatically ligating a mixture of synthetic 
oligonucleotides into gene sequences such that a 
degenerate set of potential Tango-77 sequences is 
expressible as individual polypeptides, or alternatively, 
as a set of larger fusion proteins (e.g., for phage 
display) containing the set of Tango-77 sequences 
therein. There are a variety of methods which can be 
used to produce libraries of potential Tango-77 variants 
from a degenerate oligonucleotide sequence. Chemical 
synthesis of a degenerate gene sequence can be performed 
in an automatic DNA synthesizer, and the synthetic gene 
then ligated into an appropriate expression vector. Use 
of a degenerate set of genes allows for the provision, in 
one mixture, of all of the sequences encoding the desired 
set of potential Tango-77 sequences. Methods for 
synthesizing degenerate oligonucleotides are known in the 
art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura
et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al.
(1984) Science 198:1056; Ike et al. (1983) Nucleic Acid
Res. 11:477).
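
As an illustration only, and not as part of the disclosed
methods, the membership and size of a library encoded by a
degenerate oligonucleotide can be enumerated in silico. The
Python sketch below assumes the standard IUPAC nucleotide
ambiguity codes. The function names expand_degenerate and
library_size and the example codons are hypothetical and are
not derived from any Tango-77 sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


def library_size(oligo):
    """Count the encoded sequences without enumerating them."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size


# Hypothetical degenerate codon (assumption, for illustration only).
variants = expand_degenerate("GAY")
```

Counting with library_size before enumerating is the safer
choice for long oligonucleotides, since the number of encoded
sequences grows multiplicatively with each degenerate
position.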

In addition, libraries of fragments of the
Tango-77 protein coding sequence can be used to generate
a variegated population of Tango-77 fragments for
screening and subsequent selection of variants of a
Tango-77 protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a
double-stranded PCR fragment of a Tango-77 coding sequence
with a nuclease under conditions wherein nicking occurs
only about once per molecule, denaturing the
double-stranded DNA, renaturing the DNA to form
double-stranded DNA which can include sense/antisense
pairs from different nicked products, removing
single-stranded portions from reformed duplexes by
treatment with S1 nuclease, and ligating the resulting
fragment library into an expression vector. By this
method, an expression library can be derived which
encodes N-terminal and internal fragments of various
sizes of the Tango-77 protein.

Several techniques are known in the art for 
screening gene products of combinatorial libraries made 
by point mutations or truncation, and for screening cDNA 
libraries for gene products having a selected property. 
Such techniques are adaptable for rapid screening of the 
gene libraries generated by the combinatorial mutagenesis 
of Tango-77 proteins. The most widely used techniques,
which are amenable to high-throughput analysis, for
screening large gene libraries typically include cloning
the gene library into replicable expression vectors, 
transforming appropriate cells with the resulting library 
of vectors, and expressing the combinatorial genes under 
conditions in which detection of a desired activity 
facilitates isolation of the vector encoding the gene 
whose product was detected. Recursive ensemble
mutagenesis (REM), a technique which enhances the
frequency of functional mutants in the libraries, can be
used in combination with the screening assays to identify
Tango-77 variants (Arkin and Youvan (1992) Proc. Natl.
Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993)
Protein Engineering 6(3):327-331).

An isolated Tango-77 protein, or a portion or
fragment thereof, can be used as an immunogen to generate
antibodies that bind Tango-77 using standard techniques
for polyclonal and monoclonal antibody preparation. The
full-length Tango-77 protein can be used or,
alternatively, the invention provides antigenic peptide
fragments of Tango-77 for use as immunogens. The
antigenic peptide of Tango-77 comprises at least 8
(preferably 10, 15, 20, or 30) amino acid residues of the
amino acid sequence shown in SEQ ID NO:2, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11 or SEQ ID NO:13
and encompasses an epitope of Tango-77 such that an
antibody raised against the peptide forms a specific
immune complex with Tango-77.
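
By way of a hedged illustration, candidate peptides of a
given minimum length can be listed from a protein sequence
with a simple sliding window. The sketch below is an
assumption-laden example: the function name peptide_windows
is hypothetical, the sequence is an arbitrary placeholder
rather than any of the SEQ ID NOs recited above, and the
code makes no prediction about which windows actually
contain an epitope.

```python
def peptide_windows(protein, length=8):
    """List (1-based start, peptide) for every window of `length` residues."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    last_start = len(protein) - length
    return [(i + 1, protein[i:i + length]) for i in range(last_start + 1)]


# Placeholder sequence (assumption), NOT a Tango-77 sequence.
placeholder = "MKTAYIAKQRQISFVK"
candidates = peptide_windows(placeholder, length=8)
```

The same call with length set to 10, 15, 20 or 30 lists
candidates at the other lengths mentioned above. A sequence
shorter than the requested length simply yields an empty
list.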

A Tango-77 immunogen typically is used to prepare
antibodies by immunizing a suitable subject (e.g.,
rabbit, goat, mouse or other mammal) with the immunogen.
An appropriate immunogenic preparation can contain, for
example, recombinantly expressed Tango-77 protein or a
chemically synthesized Tango-77 polypeptide. The
preparation can further include an adjuvant, such as
Freund's complete or incomplete adjuvant, or similar
immunostimulatory agent. Immunization of a suitable
subject with an immunogenic Tango-77 preparation induces
a polyclonal anti-Tango-77 antibody response.

Accordingly, another aspect of the invention
pertains to anti-Tango-77 antibodies. The term
"antibody" as used herein refers to immunoglobulin
molecules and immunologically active portions of
immunoglobulin molecules, i.e., molecules that contain an
antigen binding site which specifically binds an antigen,
such as Tango-77. A molecule which specifically binds to
Tango-77 is a molecule which binds Tango-77, but does not
substantially bind other molecules in a sample, e.g., a
biological sample, which naturally contains Tango-77.
Examples of immunologically active portions of
immunoglobulin molecules include F(ab) and F(ab')2
fragments which can be generated by treating the antibody
with an enzyme such as pepsin. The invention provides
polyclonal and monoclonal antibodies that bind Tango-77.
The term "monoclonal antibody" or "monoclonal antibody
composition", as used herein, refers to a population of
antibody molecules that contain only one species of an
antigen binding site capable of immunoreacting with a
particular epitope of Tango-77. A monoclonal antibody
composition thus typically displays a single binding
affinity for a particular Tango-77 protein with which it
immunoreacts.

Polyclonal anti-Tango-77 antibodies can be
prepared as described above by immunizing a suitable
subject with a Tango-77 immunogen. The anti-Tango-77
antibody titer in the immunized subject can be monitored
over time by standard techniques, such as with an enzyme
linked immunosorbent assay (ELISA) using immobilized
Tango-77. If desired, the antibody molecules directed
against Tango-77 can be isolated from the mammal (e.g.,
from the blood) and further purified by well-known
techniques, such as protein A chromatography to obtain
the IgG fraction. At an appropriate time after
immunization, e.g., when the anti-Tango-77 antibody
titers are highest, antibody-producing cells can be
obtained from the subject and used to prepare monoclonal
antibodies by standard techniques, such as the hybridoma
technique originally described by Kohler and Milstein
(1975) Nature 256:495-497, the human B cell hybridoma
technique (Kozbor et al. (1983) Immunol Today 4:72), the
EBV-hybridoma technique (Cole et al. (1985), Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96) or trioma techniques. The technology for
producing hybridomas is well known (see generally Current
Protocols in Immunology (1994) Coligan et al. (eds.) John
Wiley & Sons, Inc., New York, NY). Briefly, an immortal
cell line (typically a myeloma) is fused to lymphocytes
(typically splenocytes) from a mammal immunized with a
Tango-77 immunogen as described above, and the culture
supernatants of the resulting hybridoma cells are
screened to identify a hybridoma producing a monoclonal
antibody that binds Tango-77.

Any of the many well known protocols used for
fusing lymphocytes and immortalized cell lines can be
applied for the purpose of generating an anti-Tango-77
monoclonal antibody (see, e.g., Current Protocols in
Immunology, supra; Galfre et al. (1977) Nature 266:550-52;
R.H. Kenneth, in Monoclonal Antibodies: A New Dimension
In Biological Analyses, Plenum Publishing Corp., New
York, New York (1980); and Lerner (1981) Yale J. Biol.
Med., 54:387-402). Moreover, the ordinarily skilled
worker will appreciate that there are many variations of
such methods which also would be useful. Typically, the
immortal cell line (e.g., a myeloma cell line) is derived
from the same mammalian species as the lymphocytes. For
example, murine hybridomas can be made by fusing
lymphocytes from a mouse immunized with an immunogenic
preparation of the present invention with an immortalized
mouse cell line, e.g., a myeloma cell line that is
sensitive to culture medium containing hypoxanthine,
aminopterin and thymidine ("HAT medium"). Any of a
number of myeloma cell lines can be used as a fusion
partner according to standard techniques, e.g., the
P3-NS1/1-Ag4-1, P3-x63-Ag8.653 or Sp2/0-Ag14 myeloma lines.
These myeloma lines are available from ATCC. Typically,
HAT-sensitive mouse myeloma cells are fused to mouse
splenocytes using polyethylene glycol ("PEG"). Hybridoma
cells resulting from the fusion are then selected using
HAT medium, which kills unfused and unproductively fused
myeloma cells (unfused splenocytes die after several days
because they are not transformed). Hybridoma cells
producing a monoclonal antibody of the invention are
detected by screening the hybridoma culture supernatants
for antibodies that bind Tango-77, e.g., using a standard
ELISA assay.

Alternative to preparing monoclonal antibody- 
secreting hybridomas, a monoclonal anti-Tango-77 antibody 
can be identified and isolated by screening a recombinant 
combinatorial immunoglobulin library (e.g., an antibody 
phage display library) with Tango-77 to thereby isolate 
immunoglobulin library members that bind Tango-77. Kits 
for generating and screening phage display libraries are 
commercially available (e.g., the Pharmacia Recombinant 
Phage Antibody System, Catalog No. 27-9400-01; and the 
Stratagene SurfZAP™ Phage Display Kit, Catalog No.
240612). Additionally, examples of methods and reagents
particularly amenable for use in generating and screening
antibody display libraries can be found in, for example,
U.S. Patent No. 5,223,409; PCT Publication No. WO
92/18619; PCT Publication No. WO 91/17271; PCT
Publication No. WO 92/20791; PCT Publication No. WO
92/15679; PCT Publication No. WO 93/01288; PCT
Publication No. WO 92/01047; PCT Publication No. WO
92/09690; PCT Publication No. WO 90/02809; Fuchs et al.
(1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum.
Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science
246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734.

Additionally, recombinant anti-Tango-77
antibodies, such as chimeric and humanized monoclonal
antibodies, comprising both human and non-human portions,
which can be made using standard recombinant DNA
techniques, are within the scope of the invention. Such
chimeric and humanized monoclonal antibodies can be
produced by recombinant DNA techniques known in the art,
for example using methods described in PCT Publication
No. WO 87/02671; European Patent Application 184,187;
European Patent Application 171,496; European Patent
Application 173,494; PCT Publication No. WO 86/01533;
U.S. Patent No. 4,816,567; European Patent Application
125,023; Better et al. (1988) Science 240:1041-1043; Liu
et al. (1987) Proc. Natl. Acad. Sci. USA 84:3439-3443;
Liu et al. (1987) J. Immunol. 139:3521-3526; Sun et al.
(1987) Proc. Natl. Acad. Sci. USA 84:214-218; Nishimura
et al. (1987) Canc. Res. 47:999-1005; Wood et al. (1985)
Nature 314:446-449; Shaw et al. (1988) J. Natl.
Cancer Inst. 80:1553-1559; Morrison (1985) Science
229:1202-1207; Oi et al. (1986) Bio/Techniques 4:214;
U.S. Patent 5,225,539; Jones et al. (1986) Nature
321:522-525; Verhoeyen et al. (1988) Science 239:1534;
and Beidler et al. (1988) J. Immunol. 141:4053-4060.

Completely human antibodies are particularly
desirable for therapeutic treatment of human patients.
Such antibodies can be produced using transgenic mice
which are incapable of expressing endogenous
immunoglobulin heavy and light chain genes, but which
can express human heavy and light chain genes. The
transgenic mice are immunized in the normal fashion with
a selected antigen, e.g., all or a portion of Tango-77.
Monoclonal antibodies directed against the antigen can be
obtained using conventional hybridoma technology. The
human immunoglobulin transgenes harbored by the
transgenic mice rearrange during B cell differentiation,
and subsequently undergo class switching and somatic
mutation. Thus, using such a technique, it is possible
to produce therapeutically useful IgG, IgA and IgE
antibodies. For an overview of this technology for
producing human antibodies, see Lonberg and Huszar (1995,
Int. Rev. Immunol. 13:65-93). For a detailed discussion
of this technology for producing human antibodies and
human monoclonal antibodies and protocols for producing
such antibodies, see, e.g., U.S. Patent 5,625,126; U.S.
Patent 5,633,425; U.S. Patent 5,569,825; U.S. Patent
5,661,016; and U.S. Patent 5,545,806. In addition,
companies such as Abgenix, Inc. (Fremont, CA), can be
engaged to provide human antibodies directed against a
selected antigen using technology similar to that
described above.

Completely human antibodies which recognize a 
selected epitope can be generated using a technique 
referred to as "guided selection." In this approach a 
selected non-human monoclonal antibody, e.g., a murine 
antibody, is used to guide the selection of a completely 
human antibody recognizing the same epitope. 

First, a non-human monoclonal antibody which binds
a selected antigen (epitope), e.g., an antibody which
inhibits Tango-77 activity, is identified. The heavy
chain and the light chain of the non-human antibody are
cloned and used to create phage display Fab fragments.
For example, the heavy chain gene can be cloned into a
plasmid vector so that the heavy chain can be secreted
from bacteria. The light chain gene can be cloned into a
phage coat protein gene so that the light chain can be
expressed on the surface of phage. A repertoire (random
collection) of human light chains fused to phage is used
to infect the bacteria which express the non-human heavy
chain. The resulting progeny phage display hybrid
antibodies (human light chain/non-human heavy chain).
The selected antigen is used in a panning screen to
select phage which bind the selected antigen. Several
rounds of selection may be required to identify such
phage. Next, human light chain genes are isolated from
the selected phage which bind the selected antigen.
These selected human light chain genes are then used to
guide the selection of human heavy chain genes as
follows. The selected human light chain genes are
inserted into vectors for expression by bacteria.
Bacteria expressing the selected human light chains are
infected with a repertoire of human heavy chains fused to
phage. The resulting progeny phage display human
antibodies (human light chain/human heavy chain).

Next, the selected antigen is used in a panning
screen to select phage which bind the selected antigen.
The phage selected in this step display completely human
antibodies which recognize the same epitope recognized by
the original selected, non-human monoclonal antibody.
The genes encoding both the heavy and light chains can be
readily isolated and further manipulated for
production of human antibody. This technology is
described by Jespers et al. (1994, Bio/technology
12:899-903).

An anti-Tango-77 antibody (e.g., monoclonal
antibody) can be used to isolate Tango-77 by standard
techniques, such as affinity chromatography or
immunoprecipitation. An anti-Tango-77 antibody can
facilitate the purification of natural Tango-77 from
cells and of recombinantly produced Tango-77 expressed in
host cells. Moreover, an anti-Tango-77 antibody can be
used to detect Tango-77 protein (e.g., in a cellular
lysate or cell supernatant) in order to evaluate the
abundance and pattern of expression of the Tango-77
protein. Anti-Tango-77 antibodies can be used
diagnostically to monitor protein levels in tissue as
part of a clinical testing procedure, e.g., to
determine the efficacy of a given treatment
regimen. Detection can be facilitated by coupling the
antibody to a detectable substance. Examples of
detectable substances include various enzymes, prosthetic
groups, fluorescent materials, luminescent materials,
bioluminescent materials, and radioactive materials.
Examples of suitable enzymes include horseradish
peroxidase, alkaline phosphatase, β-galactosidase, or
acetylcholinesterase; examples of suitable prosthetic
group complexes include streptavidin/biotin and
avidin/biotin; examples of suitable fluorescent materials
include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine
fluorescein, dansyl chloride or phycoerythrin; an example
of a luminescent material includes luminol; examples of
bioluminescent materials include luciferase, luciferin,
and aequorin; and examples of suitable radioactive
materials include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

III. Recombinant Expression Vectors and Host Cells
Another aspect of the invention pertains to
vectors, preferably expression vectors, containing a
nucleic acid molecule encoding Tango-77 (or a portion
thereof). As used herein, the term "vector" refers to a
nucleic acid molecule capable of transporting another
nucleic acid to which it has been linked. One type of
vector is a "plasmid", which refers to a circular double
stranded DNA loop into which additional DNA segments can
be ligated. Another type of vector is a viral vector,
wherein additional DNA segments can be ligated into the
viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced
(e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other
vectors (e.g., non-episomal mammalian vectors) are
integrated into the genome of a host cell upon
introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain
vectors, expression vectors, are capable of directing the
expression of genes to which they are operably linked.
In general, expression vectors of utility in recombinant



WO 99/06426 



PCT/US98/16102 




DNA techniques are often in the form of plasmids
(vectors). However, the invention is intended to include
such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses,
adenoviruses and adeno-associated viruses), which serve
equivalent functions.

The recombinant expression vectors of the
invention comprise a nucleic acid of the invention in a
form suitable for expression of the nucleic acid in a
host cell, which means that the recombinant expression
vectors include one or more regulatory sequences,
selected on the basis of the host cells to be used for
expression, which are operably linked to the nucleic acid
sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean
that the nucleotide sequence of interest is linked to the
regulatory sequence(s) in a manner which allows for
expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell
when the vector is introduced into the host cell). The
term "regulatory sequence" is intended to include
promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in
Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, CA (1990).
Regulatory sequences include those which direct
constitutive expression of a nucleotide sequence in many
types of host cell and those which direct expression of
the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be
appreciated by those skilled in the art that the design
of the expression vector can depend on such factors as
the choice of the host cell to be transformed, the level
of expression of protein desired, etc. The expression
vectors of the invention can be introduced into host
cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as
described herein (e.g., Tango-77 proteins, mutant forms
of Tango-77, fusion proteins, etc.).

The recombinant expression vectors of the 
invention can be designed for expression of Tango-77 in 
prokaryotic or eukaryotic cells, e.g., bacterial cells 
such as E. coli, insect cells (using baculovirus 
expression vectors), yeast cells or mammalian cells. 
Suitable host cells are discussed further in Goeddel, 
Gene Expression Technology: Methods in Enzymology 185, 
Academic Press, San Diego, CA (1990) . Alternatively, the 
recombinant expression vector can be transcribed and 
translated in vitro, for example using T7 promoter 
regulatory sequences and T7 polymerase. 

Expression of proteins in prokaryotes is most
often carried out in E. coli with vectors containing
constitutive or inducible promoters directing the
expression of either fusion or non-fusion proteins.
Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the
recombinant protein. Such fusion vectors typically serve
three purposes: 1) to increase expression of recombinant
protein; 2) to increase the solubility of the recombinant
protein; and 3) to aid in the purification of the
recombinant protein by acting as a ligand in affinity
purification. Often, in fusion expression vectors, a
proteolytic cleavage site is introduced at the junction
of the fusion moiety and the recombinant protein to
enable separation of the recombinant protein from the
fusion moiety subsequent to purification of the fusion
protein. Enzymes which cleave at such sites, and their
cognate recognition sequences, include Factor Xa,
thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia
Biotech Inc; Smith and Johnson (1988) Gene 67:31-40),
pMAL (New England Biolabs, Beverly, MA) and pRIT5
(Pharmacia, Piscataway, NJ) which fuse glutathione
S-transferase (GST), maltose E binding protein, or protein
A, respectively, to the target recombinant protein.
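
The separation of a recombinant protein from its fusion
moiety at an engineered cleavage site can be pictured with
the short Python sketch below. It is offered only as an
illustration under stated assumptions: the recognition
motifs and cut offsets are the commonly cited consensus
sequences for these proteases and are not taken from this
specification, the names SITES, find_cleavage_sites and
cleave are hypothetical, and the fusion sequence is an
invented placeholder rather than a Tango-77 construct.

```python
# Commonly cited consensus motifs (assumption, not from the source text).
# Each entry maps a protease to (motif, residues of the motif that stay
# on the N-terminal side of the cut).
SITES = {
    "Factor Xa": ("IEGR", 4),      # cuts after the arginine
    "thrombin": ("LVPRGS", 4),     # cuts between arginine and glycine
    "enterokinase": ("DDDDK", 5),  # cuts after the lysine
}


def find_cleavage_sites(protein):
    """Return (protease, cut position) pairs, ordered along the sequence."""
    hits = []
    for protease, (motif, offset) in SITES.items():
        start = protein.find(motif)
        while start != -1:
            hits.append((protease, start + offset))
            start = protein.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def cleave(protein, position):
    """Split a sequence into (fusion moiety side, recombinant protein side)."""
    return protein[:position], protein[position:]


# Invented placeholder: tag fragment + thrombin site + target fragment.
fusion = "MSPILGYWK" + "LVPRGS" + "MKTAYIAK"
sites = find_cleavage_sites(fusion)
tag_side, target_side = cleave(fusion, sites[0][1])
```

Note that with a thrombin site of this form the released
protein retains the residues downstream of the cut, which is
one reason the choice of protease matters when the exact
amino terminus is important.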

Examples of suitable inducible non-fusion E. coli
expression vectors include pTrc (Amann et al. (1988) Gene
69:301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press,
San Diego, California (1990) 60-89). Target gene
expression from the pTrc vector relies on host RNA
polymerase transcription from a hybrid trp-lac fusion
promoter. Target gene expression from the pET 11d vector
relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase
(T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the
transcriptional control of the lacUV 5 promoter.

One strategy to maximize recombinant protein
expression in E. coli is to express the protein in a host
bacterium with an impaired capacity to proteolytically
cleave the recombinant protein (Gottesman, Gene
Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, California (1990) 119-128).
Another strategy is to alter the nucleic acid sequence of
the nucleic acid to be inserted into an expression vector
so that the individual codons for each amino acid are
those preferentially utilized in E. coli (Wada et al.
(1992) Nucleic Acids Res. 20:2111-2118). Such alteration
of nucleic acid sequences of the invention can be carried
out by standard DNA synthesis techniques.
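
The codon substitution strategy can be sketched, purely for
illustration, as a back-translation that assigns one codon
to each amino acid. The codon table below is an assumption
made for demonstration: every codon encodes the stated
residue under the standard genetic code, but the particular
choices are not taken from Wada et al. and should be
replaced with values from an appropriate codon usage table.
The name back_translate is likewise hypothetical.

```python
# One codon per amino acid. Illustrative choices only (assumption);
# substitute preferences from a real E. coli codon usage table.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein, table=PREFERRED_CODON):
    """Encode a protein sequence using one chosen codon per residue."""
    try:
        return "".join(table[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"no codon assigned for residue {err.args[0]!r}") from None


# Placeholder tripeptide (assumption), not a Tango-77 sequence.
recoded = back_translate("MKL")
```

Because the encoded amino acid sequence is unchanged, a
recoded sequence of this kind differs from the starting
nucleic acid only at synonymous positions.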

In another embodiment, the Tango-77 expression
vector is a yeast expression vector. Examples of vectors
for expression in yeast S. cerevisiae include pYepSec1
(Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan
and Herskowitz (1982) Cell 30:933-943), pJRY88 (Schultz
et al. (1987) Gene 54:113-123), pYES2 (Invitrogen
Corporation, San Diego, CA), and pPicZ (Invitrogen Corp,
San Diego, CA).

Alternatively, Tango-77 can be expressed in insect
cells using baculovirus expression vectors. Baculovirus
vectors available for expression of proteins in cultured
insect cells (e.g., Sf9 cells) include the pAc series
(Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the
pVL series (Luckow and Summers (1989) Virology
170:31-39).

In yet another embodiment, a nucleic acid of the
invention is expressed in mammalian cells using a
mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed (1987) Nature
329:840) and pMT2PC (Kaufman et al. (1987) EMBO J.
6:187-195). When used in mammalian cells, the expression
vector's control functions are often provided by viral
regulatory elements. For example, commonly used
promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40. For other suitable
expression systems for both prokaryotic and eukaryotic
cells see chapters 16 and 17 of Sambrook et al. (supra).

In another embodiment, the recombinant mammalian
expression vector is capable of directing expression of
the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to
express the nucleic acid). Tissue-specific regulatory
elements are known in the art. Non-limiting examples of
suitable tissue-specific promoters include the albumin
promoter (liver-specific; Pinkert et al. (1987) Genes
Dev. 1:268-277), lymphoid-specific promoters (Calame and
Eaton (1988) Adv. Immunol. 43:235-275), in particular
promoters of T cell receptors (Winoto and Baltimore
(1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et
al. (1983) Cell 33:729-740; Queen and Baltimore (1983)
Cell 33:741-748), neuron-specific promoters (e.g., the
neurofilament promoter; Byrne and Ruddle (1989) Proc.
Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific
promoters (Edlund et al. (1985) Science 230:912-916), and
mammary gland-specific promoters (e.g., milk whey
promoter; U.S. Patent No. 4,873,316 and European
Application Publication No. 264,166). Developmentally
regulated promoters are also encompassed, for example the
murine hox promoters (Kessel and Gruss (1990) Science
249:374-379) and the α-fetoprotein promoter (Camper and
Tilghman (1989) Genes Dev. 3:537-546).

The invention further provides a recombinant
expression vector comprising a DNA molecule of the
invention cloned into the expression vector in an
antisense orientation. That is, the DNA molecule is
operably linked to a regulatory sequence in a manner
which allows for expression (by transcription of the DNA
molecule) of an RNA molecule which is antisense to
Tango-77 mRNA. Regulatory sequences operably linked to a
nucleic acid cloned in the antisense orientation can be
chosen which direct the continuous expression of the
antisense RNA molecule in a variety of cell types, for
instance viral promoters and/or enhancers, or regulatory
sequences can be chosen which direct constitutive, tissue
specific or cell type specific expression of antisense
RNA. The antisense expression vector can be in the form
of a recombinant plasmid, phagemid or attenuated virus in
which antisense nucleic acids are produced under the
control of a high efficiency regulatory region, the
activity of which can be determined by the cell type into
which the vector is introduced. For a discussion of the
regulation of gene expression using antisense genes see
Weintraub et al. (Reviews - Trends in Genetics, Vol. 1(1)
1986).
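
The relationship between a coding sequence and the antisense
RNA transcribed from it in the reverse orientation can be
made concrete with the following Python sketch. It is an
illustration only: the function names reverse_complement and
antisense_rna are hypothetical, and the six-nucleotide
example is a placeholder rather than Tango-77 sequence.

```python
# Base-pairing complement for DNA, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(coding_dna):
    """RNA complementary to the mRNA made from `coding_dna` (5' to 3').

    Cloning the coding sequence in the antisense orientation places its
    reverse complement downstream of the promoter, so the transcript is
    the reverse complement with U in place of T.
    """
    return reverse_complement(coding_dna).upper().replace("T", "U")


# Placeholder coding fragment (assumption), not a Tango-77 sequence.
transcript = antisense_rna("ATGGCC")
```

Read 3' to 5', the resulting transcript pairs base for base
with the sense mRNA corresponding to the same fragment.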

Another aspect of the invention pertains to host
cells into which a recombinant expression vector of the
invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein.
It is understood that such terms refer not only to the
particular subject cell but to the progeny or potential
progeny of such a cell. Because certain modifications
may occur in succeeding generations due to either
mutation or environmental influences, such progeny may
not, in fact, be identical to the parent cell, but are
still included within the scope of the term as used
herein.

A host cell can be any prokaryotic or eukaryotic
cell. For example, Tango-77 protein can be expressed in
bacterial cells such as E. coli, insect cells, yeast or
mammalian cells (such as Chinese hamster ovary cells
(CHO) or COS cells). Other suitable host cells are known
to those skilled in the art.

Vector DNA can be introduced into prokaryotic or
eukaryotic cells via conventional transformation or
transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer
to a variety of art-recognized techniques for introducing
foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride
co-precipitation, DEAE-dextran-mediated transfection,
lipofection, or electroporation. Suitable methods for
transforming or transfecting host cells can be found in
Sambrook, et al. (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is
known that, depending upon the expression vector and
transfection technique used, only a small fraction of
cells may integrate the foreign DNA into their genome.
In order to identify and select these integrants, a gene
that encodes a selectable marker (e.g., for resistance to
antibiotics) is generally introduced into the host cells
along with the gene of interest. Preferred selectable
markers include those which confer resistance to drugs,
such as G418, hygromycin and methotrexate. Nucleic acid
encoding a selectable marker can be introduced into a
host cell on the same vector as that encoding Tango-77 or
can be introduced on a separate vector. Cells stably
transfected with the introduced nucleic acid can be
identified by drug selection (e.g., cells that have
incorporated the selectable marker gene will survive,
while the other cells die).

A host cell of the invention, such as a
prokaryotic or eukaryotic host cell in culture, can be
used to produce (i.e., express) Tango-77 protein.
Accordingly, the invention further provides methods for
producing Tango-77 protein using the host cells of the
invention. In one embodiment, the method comprises
culturing the host cell of the invention (into which a
recombinant expression vector encoding Tango-77 has been
introduced) in a suitable medium such that Tango-77
protein is produced. In another embodiment, the method
further comprises isolating Tango-77 from the medium or
the host cell.

The host cells of the invention can also be used
to produce non-human transgenic animals. For example, in
one embodiment, a host cell of the invention is a
fertilized oocyte or an embryonic stem cell into which
Tango-77-coding sequences have been introduced. Such
host cells can then be used to create non-human
transgenic animals in which exogenous Tango-77 sequences
have been introduced into their genome or homologous
recombinant animals in which endogenous Tango-77
sequences have been altered. Such animals are useful for
studying the function and/or activity of Tango-77 and for
identifying and/or evaluating modulators of Tango-77
activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a
rodent such as a rat or mouse, in which one or more of
the cells of the animal includes a transgene. Other
examples of transgenic animals include non-human
primates, sheep, dogs, cows, goats, chickens, amphibians,
etc. A transgene is exogenous DNA which is integrated
into the genome of a cell from which a transgenic animal
develops and which remains in the genome of the mature
animal, thereby directing the expression of an encoded
gene product in one or more cell types or tissues of the
transgenic animal. As used herein, a "homologous
recombinant animal" is a non-human animal, preferably a
mammal, more preferably a mouse, in which an endogenous
Tango-77 gene has been altered by homologous
recombination between the endogenous gene and an
exogenous DNA molecule introduced into a cell of the
animal, e.g., an embryonic cell of the animal, prior to
development of the animal.

A transgenic animal of the invention can be
created by introducing Tango-77-encoding nucleic acid
into the male pronuclei of a fertilized oocyte, e.g., by
microinjection or retroviral infection, and allowing the
oocyte to develop in a pseudopregnant female foster
animal. The Tango-77 cDNA sequence (e.g., that of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or the cDNA
of ATCC 98807) can be introduced as a transgene into the
genome of a non-human animal. Alternatively, a non-human
homologue of the human Tango-77 gene, such as a mouse
Tango-77 gene, can be isolated based on hybridization to
the human Tango-77 cDNA and used as a transgene.
Intronic sequences and polyadenylation signals can also
be included in the transgene to increase the efficiency
of expression of the transgene. A tissue-specific
regulatory sequence(s) can be operably linked to the
Tango-77 transgene to direct expression of Tango-77
protein to particular cells. Methods for generating
transgenic animals via embryo manipulation and
microinjection, particularly animals such as mice, have
become conventional in the art and are described, for
example, in U.S. Patent Nos. 4,736,866 and 4,870,009,
U.S. Patent No. 4,873,191 and in Hogan, Manipulating the
Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1986). Similar methods are used for
production of other transgenic animals. A transgenic
founder animal can be identified based upon the presence
of the Tango-77 transgene in its genome and/or expression
of Tango-77 mRNA in tissues or cells of the animals. A
transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover,
transgenic animals carrying a transgene encoding Tango-77
can further be bred to other transgenic animals carrying
other transgenes.

To create a homologous recombinant animal, a
vector is prepared which contains at least a portion of a
Tango-77 gene (e.g., a human or a non-human homolog of
the Tango-77 gene, e.g., a murine Tango-77 gene) into
which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt,
the Tango-77 gene. In a preferred embodiment, the vector
is designed such that, upon homologous recombination, the
endogenous Tango-77 gene is functionally disrupted (i.e.,
no longer encodes a functional protein; also referred to
as a "knock out" vector). Alternatively, the vector
can be designed such that, upon homologous recombination,
the endogenous Tango-77 gene is mutated or otherwise
altered but still encodes functional protein (e.g., the
upstream regulatory region can be altered to thereby
alter the expression of the endogenous Tango-77 protein).
In the homologous recombination vector, the altered
portion of the Tango-77 gene is flanked at its 5' and 3'
ends by additional nucleic acid of the Tango-77 gene to
allow for homologous recombination to occur between the
exogenous Tango-77 gene carried by the vector and an
endogenous Tango-77 gene in an embryonic stem cell. The
additional flanking Tango-77 nucleic acid is of
sufficient length for successful homologous recombination
with the endogenous gene. Typically, several kilobases
of flanking DNA (both at the 5' and 3' ends) are included
in the vector (see, e.g., Thomas and Capecchi (1987) Cell
51:503 for a description of homologous recombination
vectors). The vector is introduced into an embryonic
stem cell line (e.g., by electroporation) and cells in
which the introduced Tango-77 gene has homologously
recombined with the endogenous Tango-77 gene are selected
(see, e.g., Li et al. (1992) Cell 69:915). The selected
cells are then injected into a blastocyst of an animal
(e.g., a mouse) to form aggregation chimeras (see, e.g.,
Bradley in Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed. (IRL, Oxford, 1987)
pp. 113-152). A chimeric embryo can then be implanted
into a suitable pseudopregnant female foster animal and
the embryo brought to term. Progeny harboring the
homologously recombined DNA in their germ cells can be
used to breed animals in which all cells of the animal
contain the homologously recombined DNA by germline
transmission of the transgene. Methods for constructing
homologous recombination vectors and homologous
recombinant animals are described further in Bradley
(1991) Current Opinion in Bio/Technology 2:823-829 and in
PCT Publication Nos. WO 90/11354, WO 91/01140, WO
92/0968, and WO 93/04169.

In another embodiment, transgenic non-human
animals can be produced which contain selected systems
which allow for regulated expression of the transgene.
One example of such a system is the cre/loxP recombinase
system of bacteriophage P1. For a description of the
cre/loxP recombinase system, see, e.g., Lakso et al.
(1992) Proc. Natl. Acad. Sci. USA 89:6232-6236. Another
example of a recombinase system is the FLP recombinase
system of Saccharomyces cerevisiae (O'Gorman et al.
(1991) Science 251:1351-1355). If a cre/loxP recombinase
system is used to regulate expression of the transgene,
animals containing transgenes encoding both the Cre
recombinase and a selected protein are required. Such
animals can be provided through the construction of
"double" transgenic animals, e.g., by mating two
transgenic animals, one containing a transgene encoding a
selected protein and the other containing a transgene
encoding a recombinase.
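
As a rough illustration of the mating scheme just described, the expected
fraction of "double" transgenic offspring can be estimated from simple
Mendelian segregation. The sketch below rests on assumptions that are not
stated in the text: each transgene occupies a single autosomal insertion
site, the two sites are unlinked, and each parent carries 0, 1 (hemizygous)
or 2 (homozygous) copies. The function names and dictionary keys are
hypothetical.

```python
from fractions import Fraction


def carrier_probability(copies_parent_a: int, copies_parent_b: int) -> Fraction:
    """Probability that an offspring inherits at least one transgene copy.

    Assumption: a single autosomal insertion site, so a parent carrying
    `copies` (0, 1 or 2) transmits the transgene with probability copies/2.
    """
    for copies in (copies_parent_a, copies_parent_b):
        if copies not in (0, 1, 2):
            raise ValueError("copy number must be 0, 1 or 2")
    miss_a = 1 - Fraction(copies_parent_a, 2)
    miss_b = 1 - Fraction(copies_parent_b, 2)
    return 1 - miss_a * miss_b


def double_transgenic_fraction(parent_a: dict, parent_b: dict) -> Fraction:
    """Expected fraction of offspring carrying both transgenes.

    Assumption: the two insertion sites are unlinked, so the two carrier
    probabilities multiply. The keys "recombinase" and "selected_protein"
    are hypothetical labels for the two transgenes.
    """
    fraction = Fraction(1)
    for transgene in ("recombinase", "selected_protein"):
        fraction *= carrier_probability(
            parent_a.get(transgene, 0), parent_b.get(transgene, 0)
        )
    return fraction


# Hemizygous recombinase animal crossed with a hemizygous selected-protein animal.
cre_parent = {"recombinase": 1}
target_parent = {"selected_protein": 1}
print(double_transgenic_fraction(cre_parent, target_parent))  # 1/4
```

Under these assumptions roughly one offspring in four would carry both
transgenes, so litters would need to be genotyped to find the double
transgenic animals.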

Clones of the non-human transgenic animals
described herein can also be produced according to the
methods described in Wilmut et al. (1997) Nature 385:810-813
and PCT Publication Nos. WO 97/07668 and WO 97/07669.
In brief, a cell, e.g., a somatic cell, from the
transgenic animal can be isolated and induced to exit the
growth cycle and enter G0 phase. The quiescent cell can
then be fused, e.g., through the use of electrical
pulses, to an enucleated oocyte from an animal of the
same species from which the quiescent cell is isolated.
The reconstructed oocyte is then cultured such that it
develops to morula or blastocyst and then transferred to
a pseudopregnant female foster animal. The offspring borne
of this female foster animal will be a clone of the
animal from which the cell, e.g., the somatic cell, is
isolated.

IV. Pharmaceutical Compositions

The Tango-77 nucleic acid molecules, Tango-77
proteins, and anti-Tango-77 antibodies (also referred to 
herein as "active compounds") of the invention can be 
incorporated into pharmaceutical compositions suitable 
for administration. Such compositions typically comprise 
the nucleic acid molecule, protein, or antibody and a 
pharmaceutically acceptable carrier. As used herein the 
language "pharmaceutically acceptable carrier" is 
intended to include any and all solvents, dispersion 
media, coatings, antibacterial and antifungal agents, 
isotonic and absorption delaying agents, and the like, 
compatible with pharmaceutical administration. The use 
of such media and agents for pharmaceutically active 
substances is well known in the art. Except insofar as 
any conventional media or agent is incompatible with the 
active compound, use thereof in the compositions is 
contemplated. Supplementary active compounds can also be 
incorporated into the compositions. 

A pharmaceutical composition of the invention is
formulated to be compatible with its intended route of
administration. Examples of routes of administration
include parenteral (e.g., intravenous, intradermal,
subcutaneous), oral (e.g., inhalation), transdermal
(topical), transmucosal, and rectal administration.
Solutions or suspensions used for parenteral,
intradermal, or subcutaneous application can include the
following components: a sterile diluent such as water for
injection, saline solution, fixed oils, polyethylene
glycols, glycerine, propylene glycol or other synthetic
solvents; antibacterial agents such as benzyl alcohol or
methyl parabens; antioxidants such as ascorbic acid or
sodium bisulfite; chelating agents such as
ethylenediaminetetraacetic acid; buffers such as
acetates, citrates or phosphates; and agents for the
adjustment of tonicity such as sodium chloride or
dextrose. pH can be adjusted with acids or bases, such
as hydrochloric acid or sodium hydroxide. The parenteral
preparation can be enclosed in ampoules, disposable
syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for
injectable use include sterile aqueous solutions (where
water soluble) or dispersions and sterile powders for the
extemporaneous preparation of sterile injectable
solutions or dispersions. For intravenous
administration, suitable carriers include physiological
saline, bacteriostatic water, Cremophor EL™ (BASF;
Parsippany, NJ) or phosphate buffered saline (PBS). In
all cases, the composition must be sterile and should be
fluid to the extent that easy syringability exists. It
must be stable under the conditions of manufacture and
storage and must be preserved against the contaminating
action of microorganisms such as bacteria and fungi. The
carrier can be a solvent or dispersion medium containing,
for example, water, ethanol, polyol (for example,
glycerol, propylene glycol, and liquid polyethylene
glycol, and the like), and suitable mixtures thereof.
The proper fluidity can be maintained, for example, by
the use of a coating such as lecithin, by the maintenance
of the required particle size in the case of dispersion
and by the use of surfactants. Prevention of the action
of microorganisms can be achieved by various
antibacterial and antifungal agents, for example,
parabens, chlorobutanol, phenol, ascorbic acid,
thimerosal, and the like. In many cases, it will be
preferable to include isotonic agents, for example,
sugars, polyalcohols such as mannitol or sorbitol, or
sodium chloride in the composition. Prolonged absorption
of the injectable compositions can be brought about by
including in the composition an agent which delays
absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by
incorporating the active compound (e.g., a Tango-77
protein or anti-Tango-77 antibody) in the required amount
in an appropriate solvent with one or a combination of
ingredients enumerated above, as required, followed by
filtered sterilization. Generally, dispersions are
prepared by incorporating the active compound into a
sterile vehicle which contains a basic dispersion medium
and the required other ingredients from those enumerated
above. In the case of sterile powders for the
preparation of sterile injectable solutions, the
preferred methods of preparation are vacuum drying and
freeze-drying, which yield a powder of the active
ingredient plus any additional desired ingredient from a
previously sterile-filtered solution thereof.

Oral compositions generally include an inert
diluent or an edible carrier. They can be enclosed in
gelatin capsules or compressed into tablets. For the
purpose of oral therapeutic administration, the active
compound can be incorporated with excipients and used in
the form of tablets, troches, or capsules. Oral
compositions can also be prepared using a fluid carrier
for use as a mouthwash, wherein the compound in the fluid
carrier is applied orally and swished and expectorated or
swallowed. Pharmaceutically compatible binding agents
and/or adjuvant materials can be included as part of the
composition. The tablets, pills, capsules, troches and
the like can contain any of the following ingredients, or
compounds of a similar nature: a binder such as
microcrystalline cellulose, gum tragacanth or gelatin; an
excipient such as starch or lactose; a disintegrating
agent such as alginic acid, Primogel, or corn starch; a
lubricant such as magnesium stearate or Sterotes; a
glidant such as colloidal silicon dioxide; a sweetening
agent such as sucrose or saccharin; or a flavoring agent
such as peppermint, methyl salicylate, or orange
flavoring.

For administration by inhalation, the compounds
are delivered in the form of an aerosol spray from a
pressurized container or dispenser which contains a
suitable propellant, e.g., a gas such as carbon dioxide,
or a nebulizer.

Systemic administration can also be by
transmucosal or transdermal means. For transmucosal or
transdermal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art, and
include, for example, for transmucosal administration,
detergents, bile salts, and fusidic acid derivatives.
Transmucosal administration can be accomplished through
the use of nasal sprays or suppositories. For
transdermal administration, the active compounds are
formulated into ointments, salves, gels, or creams as
generally known in the art.

The compounds can also be prepared in the form of 
suppositories (e.g., with conventional suppository bases 
such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery.

In one embodiment, the active compounds are
prepared with carriers that will protect the compound
against rapid elimination from the body, such as a
controlled release formulation, including implants and
microencapsulated delivery systems. Biodegradable,
biocompatible polymers can be used, such as ethylene
vinyl acetate, polyanhydrides, polyglycolic acid,
collagen, polyorthoesters, and polylactic acid. Methods
for preparation of such formulations will be apparent to
those skilled in the art. The materials can also be
obtained commercially from Alza Corporation and Nova
Pharmaceuticals, Inc. Liposomal suspensions (including
liposomes targeted to infected cells with monoclonal
antibodies to viral antigens) can also be used as
pharmaceutically acceptable carriers. These can be
prepared according to methods known to those skilled in
the art, for example, as described in U.S. Patent No.
4,522,811.

It is especially advantageous to formulate oral or
parenteral compositions in dosage unit form for ease of
administration and uniformity of dosage. Dosage unit
form as used herein refers to physically discrete units
suited as unitary dosages for the subject to be treated,
each unit containing a predetermined quantity of active
compound calculated to produce the desired therapeutic
effect in association with the required pharmaceutical
carrier. The specifications for the dosage unit forms of
the invention are dictated by and directly dependent on
the unique characteristics of the active compound and the
particular therapeutic effect to be achieved, and the
limitations inherent in the art of compounding such an
active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be
inserted into vectors and used as gene therapy vectors.
Gene therapy vectors can be delivered to a subject by,
for example, intravenous injection, local administration
(U.S. Patent 5,328,470) or by stereotactic injection
(see, e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA
91:3054-3057). The pharmaceutical preparation of the
gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release
matrix in which the gene delivery vehicle is imbedded.
Alternatively, where the complete gene delivery vector
can be produced intact from recombinant cells, e.g.,
retroviral vectors, the pharmaceutical preparation can
include one or more cells which produce the gene delivery
system.

The pharmaceutical compositions can be included in 
a container, pack, or dispenser together with 
instructions for administration.

V. Uses and Methods of the Invention 

The nucleic acid molecules, proteins, protein
homologues, and antibodies described herein can be used
in one or more of the following methods: a) screening
assays; b) detection assays (e.g., chromosomal mapping,
tissue typing, forensic biology); c) predictive medicine
(e.g., diagnostic assays, prognostic assays, monitoring
clinical trials, and pharmacogenomics); and d) methods of
treatment (e.g., therapeutic and prophylactic). A
Tango-77 protein interacts with other cellular proteins
and can thus be used for regulation of inflammation. The
polypeptides of the invention can be used in assays to
determine biological activity. For example, they could
be used in a panel of proteins for high-throughput
screening.

The isolated nucleic acid molecules of the
invention can be used to express Tango-77 protein (e.g.,
via a recombinant expression vector in a host cell in
gene therapy applications), to detect Tango-77 mRNA
(e.g., in a biological sample) or a genetic lesion in a
Tango-77 gene, and to modulate Tango-77 activity. In
addition, the Tango-77 proteins can be used to screen
drugs or compounds which modulate the Tango-77 activity
or expression as well as to treat disorders characterized
by insufficient or excessive production of Tango-77
protein or production of Tango-77 protein forms which
have decreased or aberrant activity compared to Tango-77
wild type protein. In addition, the anti-Tango-77
antibodies of the invention can be used to detect and
isolate Tango-77 proteins and modulate Tango-77 activity.

This invention further pertains to novel agents 
identified by the above-described screening assays and
uses thereof for treatments as described herein. 

A. Screening Assays

The invention provides a method (also referred to 
herein as a "screening assay") for identifying 
modulators, i.e., candidate or test compounds or agents 
(e.g., peptides, peptidomimetics, small molecules or 
other drugs) which bind to Tango-77 proteins or have a 
stimulatory or inhibitory effect on, for example, 
Tango-77 expression or Tango-77 activity. 

Examples of methods for the synthesis of molecular
libraries can be found in the art, for example in:
DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909;
Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422;
Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et
al. (1993) Science 261:1303; Carell et al. (1994) Angew.
Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew.
Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J.
Med. Chem. 37:1233.

Libraries of compounds may be presented in
solution (e.g., Houghten (1992) Bio/Techniques 13:412-421),
or on beads (Lam (1991) Nature 354:82-84), chips
(Fodor (1993) Nature 364:555-556), bacteria (U.S. Patent
No. 5,223,409), spores (Patent Nos. 5,571,698; 5,403,484;
and 5,223,409), plasmids (Cull et al. (1992) Proc. Natl.
Acad. Sci. USA 89:1865-1869) or phage (Scott and Smith
(1990) Science 249:386-390; Devlin (1990) Science
249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci.
USA 87:6378-6382; and Felici (1991) J. Mol. Biol.
222:301-310).

In another embodiment, an assay is used to
determine the ability of the test compound to modulate
the activity of Tango-77 or a biologically active portion
thereof, for example, by determining the ability of the
Tango-77 protein to bind to or interact with a Tango-77
target molecule. As used herein, a "target molecule" is
a molecule with which a Tango-77 protein binds or
interacts in nature, for example, a molecule on the
surface of a cell. A Tango-77 target molecule can be a
non-Tango-77 molecule or a Tango-77 protein or
polypeptide of the present invention. In one embodiment,
a Tango-77 target molecule is a component of a signal
transduction pathway; for example, Tango-77 may bind to an
IL-1 receptor or another receptor, thereby blocking the
receptor and inhibiting future signal transduction.
Determining the ability of the Tango-77 protein to bind
to or interact with a Tango-77 target molecule can be
accomplished by one of the methods described above. In a
preferred embodiment, determining the ability of the
Tango-77 protein to bind to or interact with a Tango-77
target molecule can be accomplished by determining the
activity of the target molecule. For example, the
activity of the target molecule can be determined by
detecting induction of a cellular second messenger of the
target (e.g., intracellular Ca2+, diacylglycerol, IP3,
etc.), detecting catalytic/enzymatic activity of the
target on an appropriate substrate, detecting the
induction of a reporter gene (e.g., a Tango-77-responsive
regulatory element operably linked to a nucleic acid
encoding a detectable marker, e.g., luciferase), or
detecting a cellular response, for example, inflammation.

In yet another embodiment, an assay of the present
invention is a cell-free assay comprising contacting a
Tango-77 protein or biologically active portion thereof
with a test compound and determining the ability of the
test compound to bind to the Tango-77 protein or
biologically active portion thereof. Binding of the test
compound to the Tango-77 protein can be determined either
directly or indirectly as described above. In a
preferred embodiment, the assay includes contacting the
Tango-77 protein or biologically active portion thereof
with a known compound which binds Tango-77 to form an
assay mixture, contacting the assay mixture with a test
compound, and determining the ability of the test
compound to interact with a Tango-77 protein, wherein
determining the ability of the test compound to interact
with a Tango-77 protein comprises determining the ability
of the test compound to preferentially bind to Tango-77
or biologically active portion thereof as compared to the
known compound.
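
One common way to quantify preferential binding in a competition format
like the one described above is percent displacement of the known
compound. The sketch below is illustrative only and assumes details the
text does not specify: that the known compound carries a detectable label,
that bound signal is read with and without the test compound, and that a
background reading is available. The function name and the example
readings are hypothetical.

```python
def percent_displacement(signal_known_only: float,
                         signal_with_test: float,
                         background: float = 0.0) -> float:
    """Percent of specifically bound known compound displaced by a test compound.

    signal_known_only: bound signal for Tango-77 plus labeled known compound.
    signal_with_test:  bound signal after adding the test compound.
    background:        signal measured without Tango-77 (non-specific signal).
    """
    specific_known = signal_known_only - background
    if specific_known <= 0:
        raise ValueError("known-compound signal must exceed background")
    specific_with_test = signal_with_test - background
    return 100.0 * (1.0 - specific_with_test / specific_known)


# Hypothetical readings in arbitrary units.
print(round(percent_displacement(1000.0, 400.0, background=100.0), 1))  # 66.7
```

A value near 0 would suggest the test compound does not compete with the
known compound, while a value approaching 100 would suggest strong
competition under the assay conditions used.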

In another embodiment, an assay is a cell-free
assay comprising contacting Tango-77 protein or
biologically active portion thereof with a test compound
and determining the ability of the test compound to
modulate (e.g., stimulate or inhibit) the activity of the
Tango-77 protein or biologically active portion thereof.
Determining the ability of the test compound to modulate
the activity of Tango-77 can be accomplished, for
example, by determining the ability of the Tango-77
protein to bind to a Tango-77 target molecule by one of
the methods described above for determining direct
binding. In an alternative embodiment, determining the
ability of the test compound to modulate the activity of
Tango-77 can be accomplished by determining the ability
of the Tango-77 protein to further modulate a Tango-77
target molecule. For example, the catalytic/enzymatic
activity of the target molecule on an appropriate
substrate can be determined as previously described.

In yet another embodiment, the cell-free assay
comprises contacting the Tango-77 protein or biologically
active portion thereof with a known compound which binds
Tango-77 to form an assay mixture, contacting the assay
mixture with a test compound, and determining the ability
of the test compound to interact with a Tango-77 protein,
wherein determining the ability of the test compound to
interact with a Tango-77 protein comprises determining
the ability of the Tango-77 protein to preferentially
bind to or modulate the activity of a Tango-77 target
molecule.

It is possible that membrane-bound forms of
Tango-77 exist. The cell-free assays of the present invention
are amenable to use of both forms of Tango-77. In the
case of cell-free assays comprising a membrane-bound form
of Tango-77, it may be desirable to utilize a
solubilizing agent such that the membrane-bound form of
Tango-77 is maintained in solution. Examples of such
solubilizing agents include non-ionic detergents such as
n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside,
octanoyl-N-methylglucamide, decanoyl-N-methylglucamide,
Triton® X-100, Triton® X-114, Thesit®,
Isotridecypoly(ethylene glycol ether)n,
3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate
(CHAPS),
3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane
sulfonate (CHAPSO), or
N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

In more than one embodiment of the above assay
methods of the present invention, it may be desirable to
immobilize either Tango-77 or its target molecule to
facilitate separation of complexed from uncomplexed forms
of one or both of the proteins, as well as to accommodate
automation of the assay. Binding of a test compound to
Tango-77, or interaction of Tango-77 with a target
molecule in the presence and absence of a candidate
compound, can be accomplished in any vessel suitable for
containing the reactants. Examples of such vessels
include microtitre plates, test tubes, and
micro-centrifuge tubes. In one embodiment, a fusion protein
can be provided which adds a domain that allows one or
both of the proteins to be bound to a matrix. For
example, glutathione-S-transferase/Tango-77 fusion
proteins or glutathione-S-transferase/target fusion
proteins can be adsorbed onto glutathione sepharose beads
(Sigma Chemical Co.; St. Louis, MO) or glutathione
derivatized microtitre plates, which are then combined
with the test compound or the test compound and either
the non-adsorbed target protein or Tango-77 protein, and
the mixture incubated under conditions conducive to
complex formation (e.g., at physiological conditions for
salt and pH). Following incubation, the beads or
microtitre plate wells are washed to remove any unbound
components and complex formation is measured either
directly or indirectly, for example, as described above.
Alternatively, the complexes can be dissociated from the
matrix, and the level of Tango-77 binding or activity
determined using standard techniques.

Other techniques for immobilizing proteins on
matrices can also be used in the screening assays of the
invention. For example, either Tango-77 or its target
molecule can be immobilized utilizing conjugation of
biotin and streptavidin. Biotinylated Tango-77 or target
molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well known in the
art (e.g., biotinylation kit, Pierce Chemicals; Rockford,
IL), and immobilized in the wells of streptavidin-coated
96 well plates (Pierce Chemical). Alternatively,
antibodies reactive with Tango-77 or target molecules but
which do not interfere with binding of the Tango-77
protein to its target molecule can be derivatized to the
wells of the plate, and unbound target or Tango-77
trapped in the wells by antibody conjugation. Methods
for detecting such complexes, in addition to those
described above for the GST-immobilized complexes,
include immunodetection of complexes using antibodies
reactive with the Tango-77 or target molecule, as well as
enzyme-linked assays which rely on detecting an enzymatic
activity associated with the Tango-77 or target molecule.

In another embodiment, modulators of Tango-77
expression are identified in a method in which a cell is
contacted with a candidate compound and the expression of
Tango-77 mRNA or protein in the cell is determined. The
level of expression of Tango-77 mRNA or protein in the
presence of the candidate compound is compared to the
level of expression of Tango-77 mRNA or protein in the
absence of the candidate compound. The candidate
compound can then be identified as a modulator of
Tango-77 expression based on this comparison. For
example, when expression of Tango-77 mRNA or protein is
greater (statistically significantly greater) in the
presence of the candidate compound than in its absence,
the candidate compound is identified as a stimulator of
Tango-77 mRNA or protein expression. Alternatively, when
expression of Tango-77 mRNA or protein is less
(statistically significantly less) in the presence of the
candidate compound than in its absence, the candidate
compound is identified as an inhibitor of Tango-77 mRNA
or protein expression. The level of Tango-77 mRNA or
protein expression in the cells can be determined by
methods described herein for detecting Tango-77 mRNA or
protein.
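
The comparison described above can be sketched in code. The text does not
say which statistical test is used, so the example below substitutes an
exact two-sided permutation test on the difference of mean expression
levels, which needs only the standard library and suits small replicate
numbers. The function names, the significance threshold of 0.05, and the
example measurements are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for the difference of group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(control)
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for indices in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in indices)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        n_splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / n_splits


def classify_candidate(treated, control, alpha=0.05):
    """Label a candidate compound from expression with and without the compound."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


# Hypothetical Tango-77 mRNA levels (arbitrary units), four replicates each.
with_compound = [9.8, 10.4, 10.1, 9.9]
without_compound = [5.1, 4.9, 5.3, 5.0]
print(classify_candidate(with_compound, without_compound))  # stimulator
```

With four replicates per group there are only 70 possible relabelings, so
the smallest attainable two-sided p-value is 2/70. Fewer replicates could
not reach the 0.05 threshold with this particular test.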

In yet another aspect of the invention, the
Tango-77 proteins can be used as "bait proteins" in a
two-hybrid assay or three hybrid assay (see, e.g., U.S.
Patent No. 5,283,317; Zervos et al. (1993) Cell 72:223-232;
Madura et al. (1993) J. Biol. Chem. 268:12046-12054;
Bartel et al. (1993) Bio/Techniques 14:920-924; Iwabuchi
et al. (1993) Oncogene 8:1693-1696; and PCT Publication
No. WO 94/10300), to identify other proteins which bind
to or interact with Tango-77 ("Tango-77-binding proteins"
or "Tango-77-bp") and modulate Tango-77 activity. Such
Tango-77-binding proteins are also likely to be involved
in the propagation of signals by the Tango-77 proteins
as, for example, upstream or downstream elements of the
Tango-77 pathway.

The two-hybrid system is based on the modular
nature of most transcription factors, which consist of
separable DNA-binding and activation domains. Briefly,
the assay utilizes two different DNA constructs. In one
construct, the gene that codes for Tango-77 is fused to a
gene encoding the DNA binding domain of a known
transcription factor (e.g., GAL-4). In the other
construct, a DNA sequence, from a library of DNA
sequences, that encodes an unidentified protein ("prey"
or "sample") is fused to a gene that codes for the
activation domain of the known transcription factor. If
the "bait" and the "prey" proteins are able to interact,
in vivo, forming a Tango-77-dependent complex, the
DNA-binding and activation domains of the transcription
factor are brought into close proximity. This proximity
allows transcription of a reporter gene (e.g., LacZ)
which is operably linked to a transcriptional regulatory
site responsive to the transcription factor. Expression
of the reporter gene can be detected and cell colonies
containing the functional transcription factor can be
isolated and used to obtain the cloned gene which encodes
the protein which interacts with Tango-77.

This invention further pertains to novel agents
identified by the above-described screening assays and
uses thereof for treatments as described herein.

B. Detection Assays

Portions or fragments of the cDNA sequence
identified herein (and the corresponding complete gene
sequences) can be used in numerous ways as polynucleotide
reagents. For example, the sequence can be used to: (i)
map the respective gene on a chromosome and, thus, locate
gene regions associated with genetic disease; (ii)
identify an individual from a minute biological sample
(tissue typing); and (iii) aid in forensic identification
of a biological sample. These applications are described
in the subsections below.

1. Chromosome Mapping

Once the sequence (or a portion of the sequence)
of a gene has been isolated, this sequence can be used to
map the location of the gene on a chromosome.
Accordingly, Tango-77 nucleic acid molecules described
herein, or fragments thereof, can be used to map the
location of the Tango-77 gene(s) on a chromosome. The
mapping of the Tango-77 sequences to chromosomes is an
important first step in correlating these sequences with
genes associated with disease.

Briefly, a Tango-77 gene can be mapped to
chromosomes by preparing PCR primers (preferably 15-25 bp
in length) from the Tango-77 sequences. Computer
analysis of Tango-77 sequences can be used to rapidly
select primers that do not span more than one exon in the
genomic DNA, since primers that span an exon boundary
would complicate the amplification process.
These primers can then be used for PCR screening of
somatic cell hybrids containing individual human
chromosomes. Only those hybrids containing the human
gene corresponding to the Tango-77 sequences will yield
an amplified fragment.
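
The primer selection step can be illustrated with a minimal Python sketch that lists candidate primers of 15-25 bp lying wholly within one exon. The cDNA string, the exon coordinates and the function name `candidate_primers` are hypothetical placeholders and are not taken from SEQ ID NO:1; a practical design would additionally screen for melting temperature and specificity.

```python
def candidate_primers(cdna, exon_bounds, min_len=15, max_len=25):
    """Yield (start, end, sequence) for primers contained in a single exon.

    cdna        -- cDNA sequence as a string (hypothetical example data)
    exon_bounds -- list of (start, end) half-open intervals on the cDNA,
                   one per exon (assumed known from a genomic alignment)
    """
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # Slide a window of this length without crossing the exon end.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                yield start, end, cdna[start:end]


# Hypothetical 60-base cDNA made of two 30-base exons.
cdna = "ATGGCTAGCTAGGATCCGATCGATTAGCCA" + "GGCTTAACCGGTTAAGCTTGCATGCCTGCA"
exons = [(0, 30), (30, 60)]
primers = list(candidate_primers(cdna, exons))
```

Because every window is generated inside one exon interval, no candidate can straddle the exon junction at position 30.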

Somatic cell hybrids are prepared by fusing
somatic cells from different mammals (e.g., human and
mouse cells). As hybrids of human and mouse cells grow
and divide, they gradually lose human chromosomes in
random order, but retain the mouse chromosomes. By using
media in which mouse cells cannot grow (because they lack
a particular enzyme) but in which human cells can, the
one human chromosome that contains the gene encoding the
needed enzyme will be retained. By using various media,
panels of hybrid cell lines can be established. Each
cell line in a panel contains either a single human
chromosome or a small number of human chromosomes, and a
full set of mouse chromosomes, allowing easy mapping of
individual genes to specific human chromosomes
(D'Eustachio et al. (1983) Science 220:919-924). Somatic
cell hybrids containing only fragments of human
chromosomes can also be produced by using human
chromosomes with translocations and deletions.
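
As a hedged illustration of how a hybrid panel assigns a gene to a chromosome, the following Python sketch looks for the human chromosome whose presence or absence agrees with the PCR result in every hybrid line. The panel composition, the PCR results and the function name `concordant_chromosomes` are invented for the example.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel        -- dict: hybrid name -> set of human chromosomes it retains
    pcr_positive -- dict: hybrid name -> True if an amplified fragment was seen
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        if all((chrom in retained) == pcr_positive[name]
               for name, retained in panel.items()):
            matches.append(chrom)
    return matches


# Hypothetical panel of four hybrid lines and their PCR results.
panel = {
    "hybrid_A": {"1", "2"},
    "hybrid_B": {"2", "7"},
    "hybrid_C": {"7", "X"},
    "hybrid_D": {"1", "X"},
}
pcr_positive = {"hybrid_A": True, "hybrid_B": True,
                "hybrid_C": False, "hybrid_D": False}
assignment = concordant_chromosomes(panel, pcr_positive)
```

In this invented panel only chromosome 2 is present in both PCR-positive hybrids and absent from both PCR-negative hybrids.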

PCR mapping of somatic cell hybrids is a rapid
procedure for assigning a particular sequence to a
particular chromosome. Three or more sequences can be
assigned per day using a single thermal cycler. Using
the Tango-77 sequences to design oligonucleotide primers,
sublocalization can be achieved with panels of fragments
from specific chromosomes. Other mapping strategies
which can similarly be used to map a Tango-77 sequence to
its chromosome include in situ hybridization (described
in Fan et al. (1990) Proc. Natl. Acad. Sci. USA 87:6223-
27), pre-screening with labeled flow-sorted chromosomes,
and pre-selection by hybridization to chromosome-specific
cDNA libraries.

Fluorescence in situ hybridization (FISH) of a DNA
sequence to a metaphase chromosomal spread can further be
used to provide a precise chromosomal location in one
step. Chromosome spreads can be made using cells whose
division has been blocked in metaphase by a chemical,
e.g., colcemid, that disrupts the mitotic spindle. The
chromosomes can be treated briefly with trypsin, and then
stained with Giemsa. A pattern of light and dark bands
develops on each chromosome, so that the chromosomes can
be identified individually. The FISH technique can be
used with a DNA sequence as short as 500 or 600 bases.
However, clones larger than 1,000 bases have a higher
likelihood of binding to a unique chromosomal location
with sufficient signal intensity for simple detection.
Preferably 1,000 bases, and more preferably 2,000 bases,
will suffice to get good results in a reasonable amount
of time. For a review of this technique, see Verma et
al. (Human Chromosomes: A Manual of Basic Techniques
(Pergamon Press, New York, 1988)).

Reagents for chromosome mapping can be used 
individually to mark a single chromosome or a single site 
on that chromosome, or panels of reagents can be used for 
marking multiple sites and/or multiple chromosomes. 
Reagents corresponding to noncoding regions of the genes 
actually are preferred for mapping purposes. Coding 
sequences are more likely to be conserved within gene 
families, thus increasing the chance of cross-hybridization
during chromosomal mapping.

Once a sequence has been mapped to a precise
chromosomal location, the physical position of the
sequence on the chromosome can be correlated with genetic
map data. (Such data are found, for example, in V.
McKusick, Mendelian Inheritance in Man, available on-line
through Johns Hopkins University Welch Medical Library.)
The relationship between genes and disease, mapped to the
same chromosomal region, can then be identified through
linkage analysis (co-inheritance of physically adjacent
genes), described in, e.g., Egeland et al. (1987) Nature
325:783-787.

Moreover, differences in the DNA sequences between
individuals affected and unaffected with a disease
associated with the Tango-77 gene can be determined. If
a mutation is observed in some or all of the affected
individuals but not in any unaffected individuals, then
the mutation is likely to be the causative agent of the
particular disease. Comparison of affected and
unaffected individuals generally involves first looking
for structural alterations in the chromosomes such as
deletions or translocations that are visible from
chromosome spreads or detectable using PCR based on that
DNA sequence. Ultimately, complete sequencing of genes
from several individuals can be performed to confirm the
presence of a mutation and to distinguish mutations from
polymorphisms.



2. Tissue Typing

The Tango-77 sequences of the present invention
can also be used to identify individuals from minute
biological samples. The United States military, for
example, is considering the use of restriction fragment
length polymorphism (RFLP) for identification of its
personnel. In this technique, an individual's genomic
DNA is digested with one or more restriction enzymes, and
probed on a Southern blot to yield unique bands for
identification. This method does not suffer from the
current limitations of "Dog Tags" which can be lost,
switched, or stolen, making positive identification
difficult. The sequences of the present invention are
useful as additional DNA markers for RFLP (described in
U.S. Patent 5,272,057).

Furthermore, the sequences of the present
invention can be used to provide an alternative technique
which determines the actual base-by-base DNA sequence of
selected portions of an individual's genome. Thus, the
Tango-77 sequences described herein can be used to
prepare two PCR primers from the 5' and 3' ends of the
sequences. These primers can then be used to amplify an
individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from
individuals, prepared in this manner, can provide unique
individual identifications, as each individual will have
a unique set of such DNA sequences due to allelic
differences. The sequences of the present invention can
be used to obtain such identification sequences from
individuals and from tissue. The Tango-77 sequences of
the invention uniquely represent portions of the human
genome. Allelic variation occurs to some degree in the
coding regions of these sequences, and to a greater
degree in the noncoding regions. It is estimated that
allelic variation between individual humans occurs with a
frequency of about once per 500 bases. Each of the
sequences described herein can, to some degree, be used
as a standard against which DNA from an individual can be
compared for identification purposes. Because greater
numbers of polymorphisms occur in the noncoding regions,
fewer sequences are necessary to differentiate
individuals. The noncoding sequences of SEQ ID NO:1 can
comfortably provide positive individual identification
with a panel of perhaps 10 to 1,000 primers which each
yield a noncoding amplified sequence of 100 bases. If
predicted coding sequences, such as those in SEQ ID NO:3,
SEQ ID NO:6, or SEQ ID NO:10 are used, a more appropriate
number of primers for positive individual identification
would be 500-2,000.
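
The panel sizes given above can be related to the estimated variation rate by simple arithmetic. The Python sketch below assumes, purely for illustration, that the average rate of one allelic difference per 500 bases applies uniformly to every amplified sequence; noncoding regions are expected to vary more and coding regions less, so the figures are rough averages only. The function name `expected_variable_sites` is hypothetical.

```python
def expected_variable_sites(n_amplicons, amplicon_len=100, bases_per_variant=500):
    """Expected number of polymorphic positions covered by a primer panel.

    Assumes a uniform rate of one variant per `bases_per_variant` bases,
    which is a simplification of the estimate given in the text.
    """
    return n_amplicons * amplicon_len / bases_per_variant


# Panels at the lower and upper ends of the noncoding range in the text.
small_panel = expected_variable_sites(10)
large_panel = expected_variable_sites(1000)
```

Under this simplifying assumption a panel of 10 amplified sequences of 100 bases covers about 2 variable positions, and a panel of 1,000 covers about 200.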

If a panel of reagents from Tango-77 sequences
described herein is used to generate a unique
identification database for an individual, those same
reagents can later be used to identify tissue from that
individual. Using the unique identification database,
positive identification of the individual, living or
dead, can be made from extremely small tissue samples.

3. Use of Partial Tango-77 Sequences in Forensic Biology

DNA-based identification techniques can also be
used in forensic biology. Forensic biology is a
scientific field employing genetic typing of biological
evidence found at a crime scene as a means for positively
identifying, for example, a perpetrator of a crime. To
make such an identification, PCR technology can be used
to amplify DNA sequences taken from very small biological
samples such as tissues, e.g., hair or skin, or body
fluids, e.g., blood, saliva, or semen found at a crime
scene. The amplified sequence can then be compared to a
standard, thereby allowing identification of the origin
of the biological sample.

The sequences of the present invention can be used
to provide polynucleotide reagents, e.g., PCR primers,
targeted to specific loci in the human genome, which can
enhance the reliability of DNA-based forensic
identifications by, for example, providing another
"identification marker" (i.e., another DNA sequence that
is unique to a particular individual). As mentioned
above, actual base sequence information can be used for
identification as an accurate alternative to patterns
formed by restriction enzyme generated fragments.
Sequences targeted to noncoding regions of SEQ ID NO:1
are particularly appropriate for this use as greater
numbers of polymorphisms occur in the noncoding regions,
making it easier to differentiate individuals using this
technique. Examples of polynucleotide reagents include
the Tango-77 sequences or portions thereof, e.g.,
fragments derived from the noncoding regions of SEQ ID
NO:1 having a length of at least 20 or 30 bases.

The Tango-77 sequences described herein can
further be used to provide polynucleotide reagents, e.g.,
labeled or labelable probes which can be used in, for
example, an in situ hybridization technique, to identify
a specific tissue, e.g., brain tissue. This can be very
useful in cases where a forensic pathologist is presented
with a tissue of unknown origin. Panels of such Tango-77
probes can be used to identify tissue by species and/or
by organ type.

In a similar fashion, these reagents, e.g.,
Tango-77 primers or probes, can be used to screen tissue
culture for contamination (i.e., screen for the presence
of a mixture of different types of cells in a culture).

C. Predictive Medicine

The present invention also pertains to the field
of predictive medicine in which diagnostic assays,
prognostic assays, pharmacogenomics, and monitoring
clinical trials are used for prognostic (predictive)
purposes to thereby treat an individual prophylactically.
Accordingly, one aspect of the present invention relates
to diagnostic assays for determining Tango-77 protein
and/or nucleic acid expression as well as Tango-77
activity, in the context of a biological sample (e.g.,
blood, serum, cells, tissue) to thereby determine whether
an individual is afflicted with a disease or disorder, or
is at risk of developing a disorder, associated with
aberrant Tango-77 expression or activity. The invention
also provides for prognostic (or predictive) assays for
determining whether an individual is at risk of
developing a disorder associated with Tango-77 protein,
nucleic acid expression or activity. For example,
mutations in a Tango-77 gene can be assayed in a
biological sample. Such assays can be used for
prognostic or predictive purposes to thereby
prophylactically treat an individual prior to the onset
of a disorder characterized by or associated with
Tango-77 protein, nucleic acid expression or activity.

Another aspect of the invention provides methods
for determining Tango-77 protein, nucleic acid expression
or Tango-77 activity in an individual to thereby select
appropriate therapeutic or prophylactic agents for that
individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents
(e.g., drugs) for therapeutic or prophylactic treatment
of an individual based on the genotype of the individual
(e.g., the genotype of the individual examined to
determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to
monitoring the influence of agents (e.g., drugs or other
compounds) on the expression or activity of Tango-77 in
clinical trials.

These and other agents are described in further 
detail in the following sections. 



1. Diagnostic Assays

An exemplary method for detecting the presence or
absence of Tango-77 in a biological sample involves
obtaining a biological sample from a test subject and
contacting the biological sample with a compound or an
agent capable of detecting Tango-77 protein or nucleic
acid (e.g., mRNA, genomic DNA) that encodes Tango-77
protein such that the presence of Tango-77 is detected in
the biological sample. A preferred agent for detecting
Tango-77 mRNA or genomic DNA is a labeled nucleic acid
probe capable of hybridizing to Tango-77 mRNA or genomic
DNA. The nucleic acid probe can be, for example, a full-
length Tango-77 nucleic acid, such as the nucleic acid of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10 or a
portion thereof, such as an oligonucleotide of at least
15, 30, 50, 100, 250 or 500 nucleotides in length and
sufficient to specifically hybridize under stringent
conditions to Tango-77 mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the
invention are described herein.

A preferred agent for detecting Tango-77 protein
is an antibody capable of binding to Tango-77 protein,
preferably an antibody with a detectable label.
Antibodies can be polyclonal, or more preferably,
monoclonal. An intact antibody, or a fragment thereof
(e.g., Fab or F(ab')2) can be used. The term "labeled",
with regard to the probe or antibody, is intended to
encompass direct labeling of the probe or antibody by
coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect
labeling of the probe or antibody by reactivity with
another reagent that is directly labeled. Examples of
indirect labeling include detection of a primary antibody
using a fluorescently labeled secondary antibody and end-
labeling of a DNA probe with biotin such that it can be
detected with fluorescently labeled streptavidin. The
term "biological sample" is intended to include tissues,
cells and biological fluids isolated from a subject, as
well as tissues, cells and fluids present within a
subject. That is, the detection method of the invention
can be used to detect Tango-77 mRNA, protein, or genomic
DNA in a biological sample in vitro as well as in vivo.
For example, in vitro techniques for detection of
Tango-77 mRNA include Northern hybridizations and in situ
hybridizations. In vitro techniques for detection of
Tango-77 protein include enzyme linked immunosorbent
assays (ELISAs), Western blots, immunoprecipitations and
immunofluorescence. In vitro techniques for detection of
Tango-77 genomic DNA include Southern hybridizations.
Furthermore, in vivo techniques for detection of Tango-77
protein include introducing into a subject a labeled
anti-Tango-77 antibody. For example, the antibody can be
labeled with a radioactive marker whose presence and
location in a subject can be detected by standard imaging
techniques.

In one embodiment, the biological sample contains
protein molecules from the test subject. Alternatively,
the biological sample can contain mRNA molecules from the
test subject or genomic DNA molecules from the test
subject. A preferred biological sample is a peripheral
blood leukocyte sample isolated by conventional means
from a subject.

In another embodiment, the methods further involve 
obtaining a control biological sample from a control 
subject, contacting the control sample with a compound or 
agent capable of detecting Tango-77 protein, mRNA, or 
genomic DNA, such that the presence of Tango-77 protein, 
mRNA or genomic DNA is detected in the biological sample, 
and comparing the presence of Tango-77 protein, mRNA or 
genomic DNA in the control sample with the presence of 
Tango-77 protein, mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting
the presence of Tango-77 in a biological sample (a test
sample). Such kits can be used to determine if a subject
is suffering from or is at increased risk of developing a
disorder associated with aberrant expression of Tango-77
(e.g., an immunological disorder). For example, the kit
can comprise a labeled compound or agent capable of
detecting Tango-77 protein or mRNA in a biological sample
and means for determining the amount of Tango-77 in the
sample (e.g., an anti-Tango-77 antibody or an
oligonucleotide probe which binds to DNA encoding
Tango-77, e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, or SEQ ID NO:10). Kits may also include
instructions for observing that the tested subject is
suffering from or is at risk of developing a disorder
associated with aberrant expression of Tango-77 if the
amount of Tango-77 protein or mRNA is above or below a
normal level.

WO 99/06426    PCT/US98/16102

For antibody-based kits, the kit may comprise, for
example: (1) a first antibody (e.g., attached to a solid
support) which binds to Tango-77 protein; and, optionally,
(2) a second, different antibody which binds to Tango-77
protein or the first antibody and is conjugated to a
detectable agent.

For oligonucleotide-based kits, the kit may
comprise, for example: (1) an oligonucleotide, e.g., a
detectably labeled oligonucleotide, which hybridizes to
a Tango-77 nucleic acid sequence or (2) a pair of primers
useful for amplifying a Tango-77 nucleic acid molecule.

The kit may also comprise, e.g., a buffering
agent, a preservative, or a protein stabilizing agent.
The kit may also comprise components necessary for
detecting the detectable agent (e.g., an enzyme or a
substrate). The kit may also contain a control sample or
a series of control samples which can be assayed and
compared to the test sample contained. Each component of
the kit is usually enclosed within an individual
container and all of the various containers are within a
single package along with instructions for observing
whether the tested subject is suffering from or is at
risk of developing a disorder associated with aberrant
expression of Tango-77.

2. Prognostic Assays

The methods described herein can furthermore be
utilized as diagnostic or prognostic assays to identify
subjects having or at risk of developing a disease or
disorder associated with aberrant Tango-77 expression or
activity. For example, the assays described herein, such
as the preceding diagnostic assays or the following
assays, can be utilized to identify a subject having or
at risk of developing a disorder associated with aberrant
Tango-77 expression or activity. Thus, the present invention
provides a method in which a test sample is obtained from
a subject and Tango-77 protein or nucleic acid (e.g.,
mRNA, genomic DNA) is detected, wherein the presence of
Tango-77 protein or nucleic acid is diagnostic for a
subject having or at risk of developing a disease or
disorder associated with aberrant Tango-77 expression or
activity. As used herein, a "test sample" refers to a
biological sample obtained from a subject of interest.
For example, a test sample can be a biological fluid
(e.g., serum), cell sample, or tissue.

Furthermore, the prognostic assays described 
herein can be used to determine whether a subject can be 
administered an agent (e.g., an agonist, antagonist, 
peptidomimetic, protein, peptide, nucleic acid, small 
molecule, or other drug candidate) to treat a disease or 
disorder associated with aberrant Tango-77 expression or 
activity. For example, such methods can be used to 
determine whether a subject can be effectively treated 
with a specific agent or class of agents (e.g., agents of 
a type which decrease Tango-77 activity). Thus, the
present invention provides methods for determining 
whether a subject can be effectively treated with an 
agent for a disorder associated with aberrant Tango-77 
expression or activity in which a test sample is obtained 
and Tango-77 protein or nucleic acid is detected (e.g., 
wherein the presence of Tango-77 protein or nucleic acid 
is diagnostic for a subject that can be administered the 
agent to treat a disorder associated with aberrant 
Tango-77 expression or activity).

The methods of the invention can also be used to
detect genetic lesions or mutations in a Tango-77 gene,
thereby determining if a subject with the lesioned gene
is at risk for a disorder characterized by aberrant
inflammation. In preferred embodiments, the methods
include detecting, in a sample of cells from the subject,
the presence or absence of a genetic lesion or mutation
characterized by at least one of an alteration affecting
the integrity of a gene encoding a Tango-77 protein, or
the mis-expression of the Tango-77 gene. For example,
such genetic lesions or mutations can be detected by
ascertaining the existence of at least one of: 1) a
deletion of one or more nucleotides from a Tango-77 gene;
2) an addition of one or more nucleotides to a Tango-77
gene; 3) a substitution of one or more nucleotides of a
Tango-77 gene; 4) a chromosomal rearrangement of a
Tango-77 gene; 5) an alteration in the level of a
messenger RNA transcript of a Tango-77 gene; 6) an
aberrant modification of a Tango-77 gene, such as of the
methylation pattern of the genomic DNA; 7) the presence
of a non-wild type splicing pattern of a messenger RNA
transcript of a Tango-77 gene; 8) a non-wild type level
of a Tango-77 protein; 9) an allelic loss of a Tango-77
gene; and 10) an inappropriate post-translational
modification of a Tango-77 protein. As described herein,
there are a large number of assay techniques known in the
art which can be used for detecting lesions or mutations
in a Tango-77 gene. A preferred biological sample is a
peripheral blood leukocyte sample isolated by
conventional means from a subject.

In certain embodiments, detection of the lesion
involves the use of a probe/primer in a polymerase chain
reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and
4,683,202), such as anchor PCR or RACE PCR, or,
alternatively, in a ligation chain reaction (LCR) (see,
e.g., Landegren et al. (1988) Science 241:1077-1080; and
Nakazawa et al. (1994) Proc. Natl. Acad. Sci. USA 91:360-
364), the latter of which can be particularly useful for
detecting point mutations in the Tango-77 gene (see,
e.g., Abravaya et al. (1995) Nucleic Acids Res. 23:675-
682). This method can include the steps of collecting a
sample of cells from a patient, isolating nucleic acid
(e.g., genomic, mRNA or both) from the cells of the
sample, contacting the nucleic acid sample with one or
more primers which specifically hybridize to a Tango-77
gene under conditions such that hybridization and
amplification of the Tango-77 gene (if present) occurs,
and detecting the presence or absence of an amplification
product, or detecting the size of the amplification
product and comparing the length to a control sample. It
is anticipated that PCR and/or LCR may be desirable to
use as a preliminary amplification step in conjunction
with any of the techniques used for detecting mutations
described herein.

Alternative amplification methods include: self-
sustained sequence replication (Guatelli et al. (1990)
Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional
amplification system (Kwoh et al. (1989) Proc. Natl.
Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi
et al. (1988) Bio/Technology 6:1197), or any other
nucleic acid amplification method, followed by the
detection of the amplified molecules using techniques
well known to those of skill in the art. These detection
schemes are especially useful for the detection of
nucleic acid molecules if such molecules are present in
very low numbers.

In an alternative embodiment, mutations in a
Tango-77 gene from a sample cell can be identified by
alterations in restriction enzyme cleavage patterns. For
example, sample and control DNA is isolated, amplified
(optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined
by gel electrophoresis and compared. Differences in
fragment length sizes between sample and control DNA
indicate mutations in the sample DNA. Moreover,
sequence-specific ribozymes (see, e.g., U.S. Patent
No. 5,498,531) can be used to score for the presence of
specific mutations by development or loss of a ribozyme
cleavage site.
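
The comparison of restriction fragment lengths can be sketched in Python as follows. The two short sequences are invented for the example and are not Tango-77 sequences; the enzyme shown is EcoRI, which cuts after the first base of its GAATTC recognition site, and the function name `digest` is hypothetical.

```python
def digest(seq, site, cut_offset):
    """Return sorted fragment lengths after cutting a linear sequence.

    cut_offset is the position within the recognition site at which the
    enzyme cuts the top strand (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    last_cut = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - last_cut)
        last_cut = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - last_cut)
    return sorted(fragments)


# Hypothetical control with two EcoRI sites, and a sample in which a
# single base substitution (A to G) destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCCCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCCCCCC" + "GAGTTC" + "TTTT"
control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
pattern_differs = control_fragments != sample_fragments
```

The loss of one recognition site merges two control fragments into a single longer fragment in the sample, which is the kind of difference that gel electrophoresis would reveal.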

In other embodiments, genetic mutations in
Tango-77 can be identified by hybridizing sample and
control nucleic acids, e.g., DNA or RNA, to high density
arrays containing hundreds or thousands of
oligonucleotide probes (Cronin et al. (1996) Human
Mutation 7:244-255; Kozal et al. (1996) Nature Medicine
2:753-759). For example, genetic mutations in Tango-77
can be identified in two-dimensional arrays containing
light-generated DNA probes as described in Cronin et al.,
supra. Briefly, a first hybridization array of probes
can be used to scan through long stretches of DNA in a
sample and control to identify base changes between the
sequences by making linear arrays of sequential
overlapping probes. This step allows the identification
of point mutations. This step is followed by a second
hybridization array that allows the characterization of
specific mutations by using smaller, specialized probe
arrays complementary to all variants or mutations
detected. Each mutation array is composed of parallel
probe sets, one complementary to the wild-type gene and
the other complementary to the mutant gene.
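
The two array designs described above can be outlined with a short Python sketch: one function builds sequential overlapping probes across a reference, and another builds a paired wild-type and mutant probe for one detected change. The reference sequence, the probe length of 25 bases, the chosen mutation and all function names are assumptions made for illustration and do not come from Cronin et al. or from the Tango-77 sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tiling_probes(reference, probe_len=25, step=1):
    """Sequential overlapping probes complementary to the reference."""
    return [reverse_complement(reference[i:i + probe_len])
            for i in range(0, len(reference) - probe_len + 1, step)]


def paired_probes(reference, position, mutant_base, flank=12):
    """Probe pair centered on one position: wild-type and mutant."""
    left = reference[position - flank:position]
    right = reference[position + 1:position + 1 + flank]
    wild = left + reference[position] + right
    mutant = left + mutant_base + right
    return {"wild_type": reverse_complement(wild),
            "mutant": reverse_complement(mutant)}


# Hypothetical 40-base reference and a hypothetical A-to-G change at index 20.
reference = "ATGCGTACCTGAAGCTTGGCATCCGATTAGCGTACGTAGC"
probes = tiling_probes(reference)
pair = paired_probes(reference, position=20, mutant_base="G")
```

The paired probes differ at a single base, so a sample carrying the change would be expected to hybridize preferentially to the mutant probe.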

In yet another embodiment, any of a variety of
sequencing reactions known in the art can be used to
directly sequence the Tango-77 gene and detect mutations
by comparing the sequence of the sample Tango-77 with the
corresponding wild-type (control) sequence. Examples of
sequencing reactions include those based on techniques
developed by Maxam and Gilbert ((1977) Proc. Natl. Acad.
Sci. USA 74:560) or Sanger ((1977) Proc. Natl. Acad.
Sci. USA 74:5463). It is also contemplated that any of a
variety of automated sequencing procedures can be
utilized when performing the diagnostic assays ((1995)
Bio/Techniques 19:448), including sequencing by mass
spectrometry (see, e.g., PCT Publication No. WO 94/16101;
Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and
Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-
159).
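
Once sample and control sequences have been obtained, the comparison step can be sketched in Python as below. The sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions only and not insertions or deletions; the sequences and the function name `point_differences` are hypothetical.

```python
def point_differences(control, sample):
    """List (position, control_base, sample_base) where the sequences differ.

    Positions are 1-based. Both sequences are assumed to be pre-aligned
    and of equal length, so only substitutions are reported.
    """
    if len(control) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return [(i + 1, c, s)
            for i, (c, s) in enumerate(zip(control, sample))
            if c != s]


# Hypothetical wild-type (control) and sample reads.
control = "ATGGCCAAGTTG"
sample = "ATGGCTAAGTTG"
differences = point_differences(control, sample)
```

A change found in this way would still need to be confirmed in several individuals to distinguish a mutation from a polymorphism.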

Other methods for detecting mutations in the
Tango-77 gene include methods in which protection from
cleavage agents is used to detect mismatched bases in
RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985)
Science 230:1242). In general, the technique of
"mismatch cleavage" entails providing heteroduplexes
formed by hybridizing (labeled) RNA or DNA containing the
wild-type Tango-77 sequence with potentially mutant RNA
or DNA obtained from a tissue sample. The double-
stranded duplexes are treated with an agent which cleaves
single-stranded regions of the duplex, such as those which
exist due to basepair mismatches between the control and
sample strands. RNA/DNA duplexes can be treated with
RNase to digest mismatched regions, and DNA/DNA hybrids
can be treated with S1 nuclease to digest mismatched
regions. In other embodiments, either DNA/DNA or RNA/DNA
duplexes can be treated with hydroxylamine or osmium
tetroxide and with piperidine in order to digest
mismatched regions. After digestion of the mismatched
regions, the resulting material is then separated by size
on denaturing polyacrylamide gels to determine the site
of mutation. See, e.g., Cotton et al. (1988) Proc. Natl.
Acad. Sci. USA 85:4397; Saleeba et al. (1992) Methods
Enzymol. 217:286-295. In a preferred embodiment, the
control DNA or RNA can be labeled for detection.

In still another embodiment, the mismatch cleavage 
reaction employs one or more proteins that recognize 

35 mismatched base pairs in double- stranded DNA (so called 
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H DNA mismatch repair" enzymes) in defined systems for 
detecting and mapping point mutations in Tango-77 cDNAs 
obtained from samples of cells. For example, the mutY 
enzyme of E. coli cleaves A at G/A mismatches and the 
5 thymidine DNA glycosylase from HeLa cells cleaves T at 
G/T mismatches (Hsu et al . (1994) Carcinogenesis 15:1657- 
1662) . According to an exemplary embodiment/ a probe 
based on a Tango-77 sequence, e.g., a wild-type Tango-77 
sequence, is hybridized to a cDNA or other DNA product 
io from a test celKs). The duplex is treated with a DNA 
mismatch repair enzyme, and the cleavage products, if 
any, can be detected from electrophoresis protocols or 
the like. See, e.g., U.S. Patent No. 5,459,039. 

In other embodiments, alterations in
electrophoretic mobility will be used to identify
mutations in Tango-77 genes. For example, single strand
conformation polymorphism (SSCP) may be used to detect
differences in electrophoretic mobility between mutant
and wild type nucleic acids (Orita et al. (1989) Proc.
Natl. Acad. Sci. USA 86:2766; see also Cotton (1993)
Mutat. Res. 285:125-144; Hayashi (1992) Genet Anal Tech
Appl 9:73-79). Single-stranded DNA fragments of sample
and control Tango-77 nucleic acids will be denatured and
allowed to renature. The secondary structure of
single-stranded nucleic acids varies according to
sequence, and the resulting alteration in electrophoretic
mobility enables the detection of even a single base
change. The DNA fragments may be labeled or detected with
labeled probes. The sensitivity of the assay may be
enhanced by using RNA (rather than DNA), in which the
secondary structure is more sensitive to a change in
sequence. In a preferred embodiment, the subject method
utilizes heteroduplex analysis to separate double
stranded heteroduplex molecules on the basis of changes
in electrophoretic mobility (Keen et al. (1991) Trends
Genet 7:5).

In yet another embodiment, the movement of mutant
or wild-type fragments in polyacrylamide gels containing
a gradient of denaturant is assayed using denaturing
gradient gel electrophoresis (DGGE) (Myers et al. (1985)
Nature 313:495). When DGGE is used as the method of
analysis, DNA will be modified to ensure that it does not
completely denature, for example by adding a GC clamp of
approximately 40 bp of high-melting GC-rich DNA by PCR.
In a further embodiment, a temperature gradient is used
in place of a denaturing gradient to identify differences
in the mobility of control and sample DNA (Rosenbaum and
Reissner (1987) Biophys. Chem. 265:12753).

Examples of other techniques for detecting point
mutations include, but are not limited to, selective
oligonucleotide hybridization, selective amplification,
or selective primer extension. For example,
oligonucleotide primers may be prepared in which the
known mutation is placed centrally and then hybridized to
target DNA under conditions which permit hybridization
only if a perfect match is found (Saiki et al. (1986)
Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad.
Sci. USA 86:6230). Such allele specific oligonucleotides
are hybridized to PCR amplified target DNA or a number of
different mutations when the oligonucleotides are
attached to the hybridizing membrane and hybridized with
labeled target DNA.
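
The idea of an allele specific oligonucleotide with the mutation placed centrally can be sketched as follows. This is an idealized model only: the sequences are hypothetical, the flank length of 8 bases is an arbitrary assumption, and "hybridization" is reduced to requiring a perfectly complementary site, which stands in for suitably stringent conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def make_aso(allele_seq: str, variant_pos: int, flank: int = 8) -> str:
    """Build a probe complementary to the allele with the variant base central."""
    if variant_pos - flank < 0 or variant_pos + flank >= len(allele_seq):
        raise ValueError("not enough flanking sequence around the variant")
    window = allele_seq[variant_pos - flank : variant_pos + flank + 1]
    return reverse_complement(window)


def hybridizes(probe: str, target: str) -> bool:
    """Idealized stringent hybridization: a perfect complementary site exists."""
    return reverse_complement(probe) in target


# Hypothetical wild-type segment and a single-base mutant at position 16.
wild_type = "GATTACAGGCTTAGCCATGGACCTGAACTTGCA"
mutant = wild_type[:16] + "G" + wild_type[17:]

mutant_probe = make_aso(mutant, 16)
print(hybridizes(mutant_probe, mutant))     # probe against its own allele
print(hybridizes(mutant_probe, wild_type))  # probe against the other allele
```

Placing the variant in the middle of the probe means a single mismatch destabilizes the duplex as much as possible, which is why the central position is used.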

Alternatively, allele specific amplification
technology which depends on selective PCR amplification
may be used in conjunction with the instant invention.
Oligonucleotides used as primers for specific
amplification may carry the mutation of interest in the
center of the molecule (so that amplification depends on
differential hybridization) (Gibbs et al. (1989) Nucleic
Acids Res. 17:2437-2448) or at the extreme 3' end of one
primer where, under appropriate conditions, mismatch can
prevent or reduce polymerase extension (Prossner (1993)
Tibtech 11:238). In addition, it may be desirable to
introduce a novel restriction site in the region of the
mutation to create cleavage-based detection (Gasparini et
al. (1992) Mol. Cell Probes 6:1). It is anticipated that
in certain embodiments amplification may also be
performed using Taq ligase for amplification (Barany
(1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases,
ligation will occur only if there is a perfect match at
the 3' end of the 5' sequence, making it possible to
detect the presence of a known mutation at a specific
site by looking for the presence or absence of
amplification.

The methods described herein may be performed, for
example, by utilizing pre-packaged diagnostic kits
comprising at least one probe nucleic acid or antibody
reagent described herein, which may be conveniently used,
e.g., in clinical settings to diagnose patients
exhibiting symptoms or family history of a disease or
illness involving a Tango-77 gene.

Furthermore, any cell type or tissue, preferably
peripheral blood leukocytes, in which Tango-77 is
expressed may be utilized in the prognostic assays
described herein.

3. Pharmacogenomics

Agents or modulators which have a stimulatory or
inhibitory effect on Tango-77 activity (e.g., Tango-77
gene expression) as identified by a screening assay
described herein can be administered to individuals to
treat (prophylactically or therapeutically) disorders
(e.g., acute or chronic inflammation and asthma)
associated with aberrant Tango-77 activity. In
conjunction with such treatment, the pharmacogenetics
(i.e., the study of the relationship between an
individual's genotype and that individual's response to a
foreign compound or drug) of the individual may be
considered. Differences in metabolism of therapeutics
can lead to severe toxicity or therapeutic failure by
altering the relation between dose and blood
concentration of the pharmacologically active drug. Thus,
the pharmacogenetics of the individual permits the
selection of effective agents (e.g., drugs) for
prophylactic or therapeutic treatments based on a
consideration of the individual's genotype. Such
pharmacogenomics can further be used to determine
appropriate dosages and therapeutic regimens.
Accordingly, the activity of Tango-77 protein, expression
of Tango-77 nucleic acid, or mutation content of Tango-77
genes in an individual can be determined to thereby
select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant
hereditary variations in the response to drugs due to
altered drug disposition and abnormal action in affected
persons. See, e.g., Linder (1997) Clin. Chem.
43(2):254-266. In general, two types of pharmacogenetic
conditions can be differentiated. Genetic conditions
transmitted as a single factor altering the way drugs act
on the body are referred to as "altered drug action."
Genetic conditions transmitted as single factors altering
the way the body acts on drugs are referred to as
"altered drug metabolism." These pharmacogenetic
conditions can occur either as rare defects or as
polymorphisms. For example, glucose-6-phosphate
dehydrogenase deficiency (G6PD) is a common inherited
enzymopathy in which the main clinical complication is
haemolysis after ingestion of oxidant drugs
(anti-malarials, sulfonamides, analgesics, nitrofurans)
and consumption of fava beans.

As an illustrative embodiment, the activity of
drug metabolizing enzymes is a major determinant of both
the intensity and duration of drug action. The discovery
of genetic polymorphisms of drug metabolizing enzymes
(e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450
enzymes CYP2D6 and CYP2C19) has provided an explanation
as to why some patients do not obtain the expected drug
effects or show exaggerated drug response and serious
toxicity after taking the standard and safe dose of a
drug. These polymorphisms are expressed in two
phenotypes in the population, the extensive metabolizer
(EM) and poor metabolizer (PM). The prevalence of PM
differs among populations. For example, the
gene coding for CYP2D6 is highly polymorphic and several
mutations have been identified in PM, which all lead to
the absence of functional CYP2D6. Poor metabolizers of
CYP2D6 and CYP2C19 quite frequently experience
exaggerated drug response and side effects when they
receive standard doses. If a metabolite is the active
therapeutic moiety, PM shows no therapeutic response, as
demonstrated for the analgesic effect of codeine mediated
by its CYP2D6-formed metabolite morphine. At the other
extreme are the so called ultra-rapid metabolizers, who
do not respond to standard doses. Recently, the molecular
basis of ultra-rapid metabolism has been identified to be
due to CYP2D6 gene amplification.
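
The phenotype categories described above can be summarized in a small illustrative sketch. The copy-number thresholds are assumptions made for illustration (zero functional copies for PM, more than two for ultra-rapid metabolism through gene amplification); the function names are hypothetical, and the sketch is not a genotyping or dosing tool.

```python
def cyp2d6_phenotype(functional_copies: int) -> str:
    """Map a count of functional CYP2D6 gene copies to a phenotype label.

    Thresholds are illustrative assumptions, not values from the text.
    """
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return "PM"  # poor metabolizer: no functional CYP2D6
    if functional_copies <= 2:
        return "EM"  # extensive metabolizer
    return "UM"      # ultra-rapid metabolizer: gene amplification


def expected_response(phenotype: str, active_moiety: str) -> str:
    """Summarize the standard-dose response described in the text.

    active_moiety is "parent" when the administered drug is active and
    "metabolite" when a CYP2D6-formed metabolite is the active moiety.
    """
    if active_moiety not in ("parent", "metabolite"):
        raise ValueError("active_moiety must be 'parent' or 'metabolite'")
    if phenotype == "EM":
        return "expected response"
    if phenotype == "PM":
        if active_moiety == "metabolite":
            return "no therapeutic response"
        return "exaggerated response and side effects"
    if phenotype == "UM" and active_moiety == "parent":
        return "no response to standard dose"
    return "not described in the text"


# Codeine is the text's example of a drug whose metabolite is active.
print(cyp2d6_phenotype(0), "->", expected_response("PM", "metabolite"))
```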

Thus, the activity of Tango-77 protein, expression
of Tango-77 nucleic acid, or mutation content of Tango-77
genes in an individual can be determined to thereby
select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping
of polymorphic alleles encoding drug-metabolizing enzymes
to the identification of an individual's drug
responsiveness phenotype. This knowledge, when applied
to dosing or drug selection, can avoid adverse reactions
or therapeutic failure and thus enhance therapeutic or
prophylactic efficiency when treating a subject with a
Tango-77 modulator, such as a modulator identified by one
of the exemplary screening assays described herein.

4. Monitoring of Effects During Clinical Trials

Monitoring the influence of agents (e.g., drugs,
compounds) on the expression or activity of Tango-77
(e.g., the ability to modulate aberrant inflammation) can
be applied not only in basic drug screening, but also in
clinical trials. For example, the effectiveness of an
agent, as determined by a screening assay as described
herein, to increase Tango-77 gene expression, increase
protein levels, or upregulate Tango-77 activity, can be
monitored in clinical trials of subjects exhibiting
decreased Tango-77 gene expression, decreased protein
levels, or downregulated Tango-77 activity.

Alternatively, the effectiveness of an agent, as
determined by a screening assay, to decrease Tango-77
gene expression, decrease protein levels, or downregulate
Tango-77 activity, can be monitored in clinical trials of
subjects exhibiting increased Tango-77 gene expression,
increased protein levels, or upregulated Tango-77
activity.

For example, and not by way of limitation, genes,
including Tango-77, that are modulated in cells by
treatment with an agent (e.g., compound, drug or small
molecule) which modulates Tango-77 activity (e.g., as
identified in a screening assay described herein) can be
identified. Thus, to study the effect of agents on
cellular proliferation disorders, for example, in a
clinical trial, cells can be isolated and RNA prepared
and analyzed for the levels of expression of Tango-77 and
other genes implicated in the disorder. The levels of
gene expression (i.e., a gene expression pattern) can be
quantified by Northern blot analysis or RT-PCR, as
described herein, or alternatively by measuring the
amount of protein produced, by one of the methods as
described herein, or by measuring the levels of activity
of Tango-77 or other genes. In this way, the gene
expression pattern can serve as a marker, indicative of
the physiological response of the cells to the agent.
Accordingly, this response state may be determined
before, and at various points during, treatment of the
individual with the agent.

In a preferred embodiment, the present invention
provides a method for monitoring the effectiveness of
treatment of a subject with an agent (e.g., an agonist,
antagonist, peptidomimetic, protein, peptide, nucleic
acid, small molecule, or other drug candidate identified
by the screening assays described herein) comprising the
steps of (i) obtaining a pre-administration sample from a
subject prior to administration of the agent; (ii)
detecting the level of expression of a Tango-77 protein,
mRNA, or genomic DNA in the pre-administration sample;
(iii) obtaining one or more post-administration samples
from the subject; (iv) detecting the level of expression
or activity of the Tango-77 protein, mRNA, or genomic DNA
in the post-administration samples; (v) comparing the
level of expression or activity of the Tango-77 protein,
mRNA, or genomic DNA in the pre-administration sample
with the Tango-77 protein, mRNA, or genomic DNA in the
post-administration sample or samples; and (vi) altering
the administration of the agent to the subject
accordingly. For example, increased administration of
the agent may be desirable to increase the expression or
activity of Tango-77 to higher levels than detected,
i.e., to increase the effectiveness of the agent.
Alternatively, decreased administration of the agent may
be desirable to decrease expression or activity of
Tango-77 to lower levels than detected, i.e., to decrease
the effectiveness of the agent.
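
The comparison in step (v) can be sketched in a few lines. This is only an illustration under stated assumptions: the expression values are made-up arbitrary units, the 10% minimum change is a hypothetical threshold, and the function names are not part of any described assay. Step (vi), altering administration, is left to the practitioner.

```python
from statistics import mean


def expression_change(pre_level: float, post_levels: list[float]) -> float:
    """Relative change of the mean post-administration level versus baseline."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    return (mean(post_levels) - pre_level) / pre_level


def goal_reached(change: float, desired: str, min_change: float = 0.10) -> bool:
    """Check whether expression moved in the desired direction by min_change."""
    if desired == "increase":
        return change >= min_change
    if desired == "decrease":
        return change <= -min_change
    raise ValueError("desired must be 'increase' or 'decrease'")


# Hypothetical Tango-77 expression levels in arbitrary units.
change = expression_change(100.0, [110.0, 130.0])
print(f"relative change: {change:+.0%}")
print("goal reached:", goal_reached(change, "increase"))
```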

C. Methods of Treatment

The present invention provides for both
prophylactic and therapeutic methods of treating a
subject at risk of (or susceptible to) developing or
having a disorder associated with aberrant Tango-77
expression or activity. Alternatively, disorders
associated with aberrant IL-1 production can be treated
with Tango-77. Such disorders include acute and chronic
inflammation, asthma, some classes of arthritis,
autoimmune diabetes, systemic lupus erythematosus and
inflammatory bowel disease.

1. Prophylactic Methods 
In one aspect, the invention provides a method for
preventing in a subject a disease or condition
associated with an aberrant Tango-77 expression or
activity (or aberrant IL-1 expression or activity), by
administering to the subject an agent which modulates
Tango-77 expression or at least one Tango-77 activity.
Subjects at risk for a disease which is caused or
contributed to by aberrant Tango-77 expression or
activity can be identified by, for example, any or a
combination of diagnostic or prognostic assays as
described herein. Administration of a prophylactic agent
can occur prior to the manifestation of symptoms
characteristic of the Tango-77 aberrancy, such that a
disease or disorder is prevented or, alternatively,
delayed in its progression. Depending on the type of
Tango-77 aberrancy, for example, a Tango-77 agonist or
Tango-77 antagonist agent can be used for treating the
subject. The appropriate agent can be determined based
on screening assays described herein.



2. Therapeutic Methods

Another aspect of the invention pertains to
methods of modulating Tango-77 expression or activity for
therapeutic purposes. The modulatory method of the
invention involves contacting a cell with an agent that
modulates one or more of the activities of Tango-77
protein activity associated with the cell. An agent that
modulates Tango-77 protein activity can be an agent as
described herein, such as a nucleic acid or a protein, a
naturally-occurring cognate ligand of a Tango-77 protein,
a peptide, a Tango-77 peptidomimetic, or other small
molecule. In one embodiment, the agent stimulates one or
more of the biological activities of Tango-77 protein.
Examples of such stimulatory agents include active
Tango-77 protein and a nucleic acid molecule encoding
Tango-77 that has been introduced into the cell. In
another embodiment, the agent inhibits one or more of the
biological activities of Tango-77 protein. Examples of
such inhibitory agents include antisense Tango-77 nucleic
acid molecules and anti-Tango-77 antibodies. These
modulatory methods can be performed in vitro (e.g., by
culturing the cell with the agent) or, alternatively, in
vivo (e.g., by administering the agent to a subject). As
such, the present invention provides methods of treating
an individual afflicted with a disease or disorder
characterized by aberrant expression or activity of a
Tango-77 protein or nucleic acid molecule. In one
embodiment, the method involves administering an agent
(e.g., an agent identified by a screening assay described
herein), or combination of agents that modulates (e.g.,
upregulates or downregulates) Tango-77 expression or
activity. In another embodiment, the method involves
administering a Tango-77 protein or nucleic acid molecule
as therapy to compensate for reduced or aberrant Tango-77
expression or activity.

Stimulation of Tango-77 activity is desirable in
situations in which Tango-77 is abnormally downregulated
and/or in which increased Tango-77 activity is likely to
have a beneficial effect. Conversely, inhibition of
Tango-77 activity is desirable in situations in which
Tango-77 is abnormally upregulated and/or in which
decreased Tango-77 activity is likely to have a
beneficial effect.

This invention is further illustrated by the 
following examples which should not be construed as 
limiting. The contents of all references, patents and 
published patent applications cited throughout this 
application are hereby incorporated by reference. 



EXAMPLES 

Example 1: Isolation and Characterization of Human
Tango-77 cDNAs

Cytokine genes IL-1α, IL-1β and IL-1ra have been
found to be closely clustered on chromosome 2, i.e.,
IL-1α, IL-1β and IL-1ra are located within 450 kb of each
other. BAC clones containing IL-1α and IL-1β were used
to identify other proximal unknown cytokine genes. To do
this, a BAC clone containing IL-1α and IL-1β was selected
from a BAC library (Research Genetics, Huntsville,
Alabama) using specific primers designed against IL-1α
and IL-1β. The DNA from the BAC was extracted and used
to make a random-sheared genomic library. From this BAC
library, 4000 clones were selected for sequencing. The
resulting genomic sequences were then assembled into
contigs and used to screen proprietary and public
databases. One genomic contig was found to contain two
segments of sequence which resemble IL-1ra. These two
segments are potential exons of the Tango-77 gene.

Two PCR primers were then designed from the two
potential exons and used to screen a panel of cDNA
libraries for the expression of a Tango-77 message. A
cDNA library from TNF-α treated human lung epithelia
showed a positive band of the predicted size (i.e., if
the two exons are spliced together). Using the PCR
fragment as a probe, a single cDNA clone was isolated
from the same library. This cDNA contains an insert of
989 bp. The cDNA clone contains three possible open
reading frames. The first open reading frame encompasses
534 nucleotides (nucleotides 356-889 of SEQ ID NO:1; SEQ
ID NO:3) and encodes a 178 amino acid protein (SEQ ID
NO:2). This protein may include a predicted signal
sequence of about 63 amino acids (from amino acid 1 to
about amino acid 63 of SEQ ID NO:2 (SEQ ID NO:4)) and a
predicted mature protein of about 115 amino acids (from
about amino acid 64 to amino acid 178 of SEQ ID NO:2 (SEQ
ID NO:5)).

The second putative nucleotide open reading frame
encompasses 498 nucleotides (nucleotides 389-889 of SEQ
ID NO:1; SEQ ID NO:6) and encodes a 167 amino acid
protein (SEQ ID NO:7). This protein includes a predicted
signal sequence of about 52 amino acids (from amino acid
1 to about amino acid 52 of SEQ ID NO:7 (SEQ ID NO:8))
and a predicted mature protein of about 115 amino acids
(from about amino acid 53 to amino acid 167 of SEQ ID
NO:7 (SEQ ID NO:9)).

The third open reading frame (nucleotides 372-889
of SEQ ID NO:1; SEQ ID NO:10) encompasses 408 nucleotides
and encodes a 136 amino acid protein (SEQ ID NO:11).
This protein includes a predicted signal sequence of
about 21 amino acids (from amino acid 1 to about amino
acid 21 of SEQ ID NO:11 (SEQ ID NO:12)) and a predicted
mature protein of about 115 amino acids (from about amino
acid 22 to amino acid 136 of SEQ ID NO:11 (SEQ ID
NO:13)).
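
The coordinates and lengths given for the three open reading frames can be cross-checked with simple arithmetic: an inclusive span from nucleotide a to nucleotide b covers b - a + 1 nucleotides, and each encoded amino acid corresponds to three nucleotides. The sketch below copies the figures exactly as printed above and assumes the stated nucleotide counts exclude the stop codon; it only reports whether the figures agree and does not attempt to decide which figure is correct when they differ.

```python
def span_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive, 1-based coordinate range."""
    return end - start + 1


# (label, start, end, stated nucleotides, stated amino acids), as printed.
frames = [
    ("first", 356, 889, 534, 178),
    ("second", 389, 889, 498, 167),
    ("third", 372, 889, 408, 136),
]

report = {}
for label, start, end, stated_nt, stated_aa in frames:
    span = span_length(start, end)
    coding_nt = stated_aa * 3  # assumption: stop codon not counted
    report[label] = (span, stated_nt, coding_nt)
    consistent = span == stated_nt == coding_nt
    print(f"{label}: span={span} stated={stated_nt} "
          f"3 x aa={coding_nt} consistent={consistent}")
```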

Tango-77 is predicted to be 35% identical to human
IL-1ra at the amino acid level.

Example 2: Expression of Tango-77 in Human Tissues

The expression of Tango-77 was analyzed using
Northern blot hybridization. A PCR-generated 989 bp
Tango-77 product was radioactively labeled with 32P-dCTP
using the Prime-It kit (Stratagene; La Jolla, CA)
according to the instructions of the supplier. Filters
containing human mRNA (MTNI and MTNII: Clontech; Palo
Alto, CA) were probed in ExpressHyb hybridization
solution (Clontech) and washed at high stringency
according to manufacturer's recommendations.

Tango-77 mRNA was not detected in mRNA from any of the
unstimulated tissues (brain, liver, spleen, skeletal
muscle, testis, pancreas, heart, kidney and peripheral
blood leukocytes) on Clontech Northern blots.

Over 96 cDNA libraries were then tested for the
presence of Tango-77 using PCR amplification. Only three
libraries displayed a positive signal. These libraries
were the TNFα-treated bronchoepithelium, TNFα-treated SSC
cell line and anti-CD3-treated T cells.

Example 3: Characterization of Tango-77 Protein

In this example, the predicted amino acid sequence
of human Tango-77 protein was compared to the amino acid
sequence of known protein IL-1ra. In addition, the
molecular weight of the human Tango-77 proteins was
predicted.

The human Tango-77 cDNA (Figure 1; SEQ ID NO:1)
isolated as described above encodes a 178 amino acid
protein (Figure 1; SEQ ID NO:2) or a 167 amino acid
protein (Figure 1; SEQ ID NO:7) or a 136 amino acid
protein (Figure 1; SEQ ID NO:11). The signal peptide
prediction program SIGNALP Optimized Tool (Nielsen et al.
(1997) Protein Engineering 10:1-6) predicted that
Tango-77 includes a 63 amino acid signal peptide (amino
acid 1 to about amino acid 63 of SEQ ID NO:2 (SEQ ID
NO:4)) preceding the 115 amino acid mature protein; or
preceding the 115 amino acid mature protein (about amino
acid 52 to amino acid 167 of SEQ ID NO:7 (SEQ ID NO:8));
or preceding the 115 amino acid mature protein (about
amino acid 21 to amino acid 136 of SEQ ID NO:11; SEQ ID
NO:12).

As shown in Figure 2, Tango-77 has a region of
homology to IL-1ra (SEQ ID NO:14).

Mature Tango-77 has a predicted MW of about 13 kDa
and the predicted MW for the immature Tango-77 is 19.6
kDa, 18.5 kDa or 15.2 kDa, not including
post-translational modifications.
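
For orientation, a protein's molecular weight can be roughly estimated from its length using the common rule of thumb of about 110 Da per amino acid residue. The sketch below applies that approximation to the lengths given above; it is not the method used to obtain the predicted values in the text, which would normally be computed from the actual amino acid sequences, so small differences from those values are expected.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue


def estimate_mw_kda(n_residues: int) -> float:
    """Rough molecular weight estimate in kDa from the number of residues."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Lengths taken from the text: three immature forms and the mature protein.
for label, n_residues in [
    ("immature, first frame", 178),
    ("immature, second frame", 167),
    ("immature, third frame", 136),
    ("mature", 115),
]:
    print(f"{label}: {n_residues} aa, roughly {estimate_mw_kda(n_residues):.2f} kDa")
```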

Example 4: Preparation of Tango-77 Proteins

Recombinant Tango-77 can be produced in a variety
of expression systems. For example, the mature Tango-77
peptide can be expressed as a recombinant
glutathione-S-transferase (GST) fusion protein in E. coli
and the fusion protein can be isolated and characterized.
Specifically, as described above, Tango-77 can be fused
to GST and this fusion protein can be expressed in E.
coli strain PEB199. Expression of the GST-Tango-77
fusion protein in PEB199 can be induced with IPTG. The
recombinant fusion protein can be purified from crude
bacterial lysates of the induced PEB199 strain by
affinity chromatography on glutathione beads.

Example 5: Alternatively Spliced Forms of IL-1ra and
Tango-77

The computer program Procrustes (Gelfand et al., 1996,
Proc. Natl. Acad. Sci. USA, 93:9061-9066) is an alignment
algorithm that predicts the presence of alternatively
spliced exons for a protein of interest in a stretch of
genomic DNA. Using the IL-1ra sequence, Procrustes was
used to search for the presence of additional sequences
that might encode alternatively spliced forms of IL-1ra
in the two overlapping BAC genomic sequences (see
Fig. 3 and Fig. 4). Potential sequences that encode
variant exons for IL-1ra were identified. These
predicted exons aligned well with the N-terminal region
of IL-1ra, but were not present in Tango-77. The results
from Procrustes predict the existence of more spliced
forms of IL-1ra.

Furthermore, Procrustes also predicted an
additional sequence in BAC1 and BAC2 that encodes an
alternatively spliced exon for Tango-77 (T77-procrustes;
Fig. 5). This predicted splice variant form of Tango-77,
T77-procrustes, was aligned with Tango-77 (Fig. 6) and
with IL-1ra and IL-1β (Fig. 7).

PCR primers within this sequence can be used to
generate a product that can be used to screen a panel of
cDNA libraries using standard techniques. Suitable cDNA
libraries include libraries made from TNFα-treated
bronchoepithelium, TNFα-treated SSC cell line and
anti-CD3-treated T cells. The resulting cDNA clone(s) can
be isolated from the library and sequenced to identify
additional Tango-77 cDNAs.

Equivalents

Those skilled in the art will recognize, or be
able to ascertain using no more than routine
experimentation, many equivalents to the specific
embodiments of the invention described herein. Such
equivalents are intended to be encompassed by the
following claims.

What is claimed is:

1. An isolated nucleic acid molecule selected
from the group consisting of:

a) a nucleic acid molecule comprising a
nucleotide sequence which is at least 45% identical to
the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:6, SEQ ID NO:10, the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, or a
complement thereof;

b) a nucleic acid molecule comprising a fragment
of at least 300 nucleotides of the nucleotide sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the
cDNA insert of the plasmid deposited with ATCC as
Accession Number 98807, or a complement thereof;

c) a nucleic acid molecule which encodes a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807;

d) a nucleic acid molecule which encodes a
fragment of a polypeptide comprising the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, wherein the fragment comprises at
least 15 contiguous amino acids of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or the
polypeptide encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807; and

e) a nucleic acid molecule which encodes a
naturally occurring allelic variant of a polypeptide
comprising the amino acid sequence of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino
acid sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, wherein
the nucleic acid molecule hybridizes to a nucleic acid
molecule comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10, or the complement thereof under
stringent conditions.

2. The isolated nucleic acid molecule of claim
1, which is selected from the group consisting of:

a) a nucleic acid comprising the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, or SEQ
ID NO:10 or the cDNA insert of the plasmid deposited with
ATCC as Accession Number 98807, or a complement thereof;
and

b) a nucleic acid molecule which encodes a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807.

3. The nucleic acid molecule of claim 1 further
comprising vector nucleic acid sequences.

4. The nucleic acid molecule of claim 1 further
comprising nucleic acid sequences encoding a heterologous
polypeptide.

5. A host cell containing the nucleic acid 
molecule of claim 1. 

6. The host cell of claim 5 which is a mammalian 
host cell. 

7. A non-human mammalian host cell containing
the nucleic acid molecule of claim 1.

8. An isolated polypeptide selected from the
group consisting of:

a) a fragment of a polypeptide comprising the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, wherein the fragment
comprises at least 15 contiguous amino acids of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, or SEQ ID NO:13;

b) a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10 or the complement thereof under stringent
conditions; and

c) a polypeptide which is encoded by a nucleic
acid molecule comprising a nucleotide sequence which is
at least 55% identical to a nucleic acid comprising the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, or SEQ ID NO:10.

9. The isolated polypeptide of claim 8
comprising the amino acid sequence of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino
acid sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807.

10. The polypeptide of claim 8 further comprising
heterologous amino acid sequences.

11. An antibody which selectively binds to a 
polypeptide of claim 8. 

12. A method for producing a polypeptide selected
from the group consisting of:

a) a polypeptide comprising the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, or an amino acid sequence encoded by
the cDNA insert of the plasmid deposited with ATCC as
Accession Number 98807;

b) a fragment of a polypeptide comprising the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino acid
sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, wherein
the fragment comprises at least 15 contiguous amino acids
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ
ID NO:13, or an amino acid sequence encoded by the cDNA
insert of the plasmid deposited with ATCC as Accession
Number 98807; and

c) a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, or SEQ ID NO:10
under stringent conditions;

comprising culturing the host cell of claim 5
under conditions in which the nucleic acid molecule is
expressed.

13. A method for detecting the presence of a 
polypeptide of claim 8 in a sample, comprising: 

a) contacting the sample with a compound which 
selectively binds to a polypeptide of claim 8; and 

b) determining whether the compound binds to the
polypeptide in the sample.

14. The method of claim 13, wherein the compound 
which binds to the polypeptide is an antibody. 

15. A kit comprising a compound which selectively
binds to a polypeptide of claim 8 and instructions for
use.

16. A method for detecting the presence of a
nucleic acid molecule of claim 1 in a sample, comprising
the steps of:

a) contacting the sample with a nucleic acid
probe or primer which selectively hybridizes to the
nucleic acid molecule; and

b) determining whether the nucleic acid probe or 
primer binds to a nucleic acid molecule in the sample. 

17. The method of claim 16, wherein the sample
comprises mRNA molecules and is contacted with a nucleic
acid probe.

18. A kit comprising a compound which selectively
hybridizes to a nucleic acid molecule of claim 1 and
instructions for use.

19. A method for identifying a compound which
binds to a polypeptide of claim 8 comprising the steps
of:

a) contacting a polypeptide, or a cell
expressing a polypeptide of claim 8 with a test compound;
and

b) determining whether the polypeptide binds to
the test compound.

20. The method of claim 19, wherein the binding 
of the test compound to the polypeptide is detected by a 
method selected from the group consisting of: 

a) detection of binding by direct detecting of
test compound/polypeptide binding;

b) detection of binding using a competition
binding assay; and

c) detection of binding using an assay for
Tango-77-mediated signal transduction.

21. A method for modulating the activity of a
polypeptide of claim 8 comprising contacting a
polypeptide or a cell expressing a polypeptide of claim 8
with a compound which binds to the polypeptide in a
sufficient concentration to modulate the activity of the
polypeptide.

22. A method for identifying a compound which
modulates the activity of a polypeptide of claim 8,
comprising:

a) contacting a polypeptide of claim 8 with a
test compound; and

b) determining the effect of the test compound
on the activity of the polypeptide to thereby identify a
compound which modulates the activity of the polypeptide.

'AAGTGAAGATATAATG7ATAGTAGTAATATATAATGTTAGGTGAATTAA
. *GGAAAT AGAATATATTGGGGAGTAATTATGGGTG7AAAGAAATATAGTA 
GGGAAG7A777AGA777GAGAAAAAAAAAAAGGAA7TTAG7G7AGG7GAA 
MAATAAAAGNANAAGG7TAAAAATTAAAAAAAAATTAAATATAAATAAAT 
AAATAAAAATAAAAATAAAATAAAAAATT7AAAAAATTAAAAAAATATAA 

rATGAGGAGATAGAGAGGATGTGGTGTGAGATGATTGGTTTAATTAGAAA 
ATAGGo. i^GAATAGAGTGGGAAAGTAGAGTTTTGGTAAATGTGGGGGGA 
AGAGGGTAATGTTGTTTGAGTGAAAGAAAAAATGSTiTaTTTTraT* & a a 



GGCATCT3ATGCTCTCAGGCAGGAGTCCACAAATTTTTTTTTGTAAAAGA 
-CAGA7AGTAAA7CT777CAGCG7GAAGAGCA7GAGG7C7C7G7CACAAA 

AATGAG * « i GG^TGTGTTCCAGTAAATCTTGATTACAAAAACAGGCAAGA 
rfSS^^^^^^^^^CC^CTGTAAAGGA 
^*7 A iZT^ CT ^^ OT ^ A ^^'^^^ G ^CACAAGGCTC'^ 
^^PII^^^^C^G^G^GATTGATGAGTGGCAAGTAATTTT

>Contig3

GGGGTGTCTGTCTACCATGTGCTCGCAGTTCTGTAATAAATGTTCTCTCA 
A 5?tES£^ AAAA ^^^^ GGA ^^ATAAAAATATTGGAAAGAGAAGAAC 
AGTTTTTAAAATATATATATATATATATATTTTTTTTGAGATGGAGTCT r 
GCTCTGTCGTCCAGGCTGGAGTGCAGTGGCGCAAACTTGGTTCACCACAA 
^E TG E£ TCCCGGGCT ^ GCGA ^CTTCTGCCTCAGCCTCCTGAGTAG 
HI G ^F A ^ CGCCCGCCAC »CGCCCAGCTAATTTTTGTATTTTTA 
oxAGAGaCGAGGTTTTACTATGTTGGCTAGGCTGGTCTCAAACTCCTGAC 
:^T^ A I CTGCCCGCOTGGCC ^CCCAAAGTGCTGGGATTACAGGTGTG 
C^ECS * GGAGG ^^CAGTTTTTTAAATATATTTTTAAAAACACTTGAA 

A AA J^ G i^ G7GTAAAC ™ QA ^^ 

^rE A ^^ A ^ GA ^ CT ^CAACAAACCTATTGTAAAGGTGAGTAAG 
G^GTTATTAOVGAGAAAAGTTTGGGAGCAAAACTGTAAAAAATTATAT 
TTTTGTTG7ATTTTCTAAGAGAAAGAGTATTGTTATGTTCTCCTAACCTC 

TGTTGATTACTACrTTAAGTGATTTCCTTGAGAGCACATGATGATCC

>Contig4



A SI? AGTGTG ^ ACCCTGTGGCT AGTGTAGACCAATGATCCTGTCTCAGA 

GT»C7AGCCAAC^CCA7ATCAAGTAC7TGAAACT7TGACT^ 

CTCAGTGTCAGAACCTTTGACCTAGGAACCACCTGTAGTGGTTAACTGCA 

AAGGCATANANCTGATGACCTAAACMAAACC^ 

ISZZ AAG7G7A ^ AAA7AAG ^'^'^^ AG ^^^ G CAGGTAATGCCTT
TAAGTCTGCCTTGGCAGGCACTTGCAG * >jTTTGAAAGAATCAGATATAT^

AAATTTGTAGTTTAAAATATTTAAGGGAACTCAATTAACTATGCTAGAAA 

AGAGAATTAAGTATTTAGGAGGATTTAATATGGTGTGAAAGTTGTGAAAA 

TCAAAATGGAGACACTAATGTTAAGAAAACCCTGATAAATGGAACCAGGG 

AAAGGCATGAAGATAGAGTTCTCACACTTGTATCCCTGATCATGAAAAAG 
ATCTGC

>Contig5

GGGTTTT7 CCGCGTTTTTACCCGAAATCTTCAAGGGATGGGAAAAAGAAA 

ATTGCTAAAAAATCTCGGTTTTTTGGTTTTAAQ^GATATTTACACCNTGG 

ATCCCATTrATTATGTTGTCCCCAAGGTTTTCGGTGGGTTCCCAATCAGT 

TAGCCCCCCTCCACAGTGAAAGCACTTTACTTTATCACCTTCACCTAAAG 

CATAAAATCCAGCTCTTGAAAGCTGCTCCTTGTTAACTGAATATATCCAC 

ATCCCAAAAGTAATGATCCATGCTTCATAATCTGCCACGGATGGATGGAT 

GGATGGATGGATGGATGGATGGATGGATGAATGGATGGATTGATTTCTTG 

GAGGATTTGTTGAATTTGGGAAATTCCACGCCAGGACAGCTGGCCCAAAC 

TGCCCGCGACAATCTGCTCGGTACAAGGGGAGGGTCCTGGAGAGGGTGCG 

GCCCGAGCCCCAGTTTGGAAATGCCAACTTGGCTCTGCAGCCGGGCCTTA 

GCCACTTGGGTCTGGCGTCCCTCCATTATTAGCGCCATGCCGGCTCGGGG 

TGCTGCCAAGTCCCTGAGAGCACAAGCC 

>Contig6 

CGCGCTCAAGAAAAGCTGAAGTGTGAATGTTCTGTCTACCTTCACAGTAA 
ATGCTAAGAGAATGACCCAAGAGCAGAGGGTATCACTCTGCTACGGAGGA 
TTGAT7GTAACTGGCTCTCCTGCCTTAGCAAGAAATGCCAGAACCATGGT 
CATTCAAGTTCTTGACCAAAAACTGCCTTCATGAGAATCAACTTCCCCAA 
GAAAAAAAAAGCAGAAACAGGCAAAGCTTCCAGCATGGTAGGTAATACTG 
ACCCpCTTCCCTCCTTCCTTTGGAGATTCACACAGTAATAATGCATAAA 
GCITTGCCAA TGGAC TAAGCACTGCCCAGGGGTTTTTGTCATGCCTGGAC 
TGAAATGCTCTTTTTGCGTTATCATAGAATCCCAGTGCAGTCTGAGTAGA 
CTCTAAGCAAAAGGGACATTTTTCAAAAAGGCTTTAAATTGCTAGTAQ^A 
AGAAGGCAACAAAACTTGCGTAACTGTGGACAGATTAACTCACTTGGTGT 
TTTGGCTCTTCAGTTTTCCCTTGGCTGCGAAGTACTCCTGAAGCTTTCTC 

TGCGGCTCTTCCTGCAAGCAGGCAAGCAAAAAAACGACTGAACTTTATTT 

CGAGAT 

>Contig7 

GAAGAGCCGCTAACTTGCTGTAGTGATAAGGAATGAACTAAGGCTAGGGA 

CATATTAACATCCGCTGGTGGTGACTCTTTAGCCTAGATCTTACCCCACT 

ZCTGCTCCTTCCATATGGTTCGGTCTCAGGCTCACTACCGATCAATGGCG 

TACTAAAAGCAC7AACTATAGACTCCAACACGTCTGTCGTGTGTTTCACG 

ACAAGCCGTGGAGTTAATCCCTCTGACAGTAGCTCAGATAAGGATGGGCT 

ATCATGGGCCCGGAACTGGGGCATGACGCTCGTCACCAACGCATGAGCTC 

CCCAAGTATGCTATACCTGTCCCTATGAAGGGCTTCCAACTCTATGTGCA 

GTCCCCATGTGGAGAGTCAGGTATTGATTGATCAAGCCAGGGGTGTGGTG 

AAT GGGG AGCTTCCTACAGGGGTAATGATAATTGAAATGCACGGTGATGG 

GGATTTTCATATTGGTCTCCTAAGGAGATAACAGATTGGATGCGGGGTCG 

ATATTCCACTGCCCAGGGTGTGTACCGAQGGTATCTGCAGGTGGATCTCC 

TCCCCACGTTTGATTAATACTCCTGTCTTGGGAAGCATAGACGGGCGGGG 

GAAATGATGAAGGGTGACCACTCCCC 

>Contig8 

GGGAACGCAGTGCTCTGTACGATGGCCTTGATTGCGAATTCCTGCAGGGG 
GGG 

>Contig9 

GGCAAG AGATTTAATATTCATTC CATCTTCATTTGGAAGATGAAAAATTG 
GGGACCAGAGAGGGGAGGGGACTGGGCCAAGTTTTCAAAGAAAAGTCAGT 
AGGAATTGTGAATTCCTGGGGGCCGGGGCCCATTAGTGCTGTTTTGGATC 
AGTAAATGGAGATGTGAGTTTCAACAGTAACAGGGACATTTTAAAATTAA 
AATGATTTAACCTTTAGAAAATGTCCTATTTTGTAATAATGATGGATTCA 
CAGGAAGGTACAAAGAAATGTCCAGAGAGTTCNTGAGCCCCCTTCAGCCA 
GCTTCTTCCAATGTTAACATCTTGCATTATTATAGTACAACATCAAAACT 
GGGAAATC3ATATTGGTACTGTCCAGATAGCTTACTCAGATTTTGCCAGT 
TATACTTCCACTCATTTGTGTG7GTGTGTGTGTGTGTGTGTGTGTGTGTG
TGTGTGTAGCTCTATGLrtATTTTATGl ^ *G7AGC77CA7G7AAGCA£t**J

AA7CACAA7AC7TAAC7A7GCCC7CA7CACAAGAC7C7C7C77GC7A7GC 

777ACAGC7d7A7CC7CTTCATC7CCAAACCC7AAGCCCACC7CACCGCC 

TCCACCATCTCTAATCCCTGGCAACCACTATTCTGTGCTCCATCTCTGTA 

ATTAATTGTG7TAATTAATGTTATACAAATGGAATCATGAAG7ATGTGTC 

CTTTGAGATTGGGCTGTTAATTTTTCACTCAGCACAATTTCCGTGAGTCT 

AATCCAACTrGTGTGTAGCAGTAATTCTTTCCTTATTATTGCTGAATAAT 

ATGCCATGGTATGGATGTATCACAGTGTGTCTAATCCTTTGCCCATTGAA 

AGGAATTTGGATAATTTCCAGGTTTTGGCTATTATGAATAAAGTGAACAT 

AAGACATGTGTGTACAAATTTTGGTGTGATCAAAAGTCTCATTTCTCTGG 

GATAAATGCCCGGTAATGAAATGGCTGGGTTGTGTGGG

>Contig10

GCAAGAACACAGGCGCGTATTATAACCTTACTACCAAGACCTGAACCCAT 

ATAAAGGTTTATGCGTAACAATCATCATCCCTGTTCCAGAAGATTACACG 

TACGACCACGCCTGGCTCACCGACTCACGTGGGCCAGTACCAGAAATTCT 

CCCAAACAAACAGTCGTGTCTGAAAACAATCGCGGTGACCTCCACGGTTA 

GAAAAGCCTGTTTTCAAGTCCTGGAATTGCCACATATTAGCTGGGTAACT 

TTGGGCATCACATTTACTCTCTCCGAATTTCAGATTGCAAAAACTCATTG 

GATTGTTTTGTGGATTGAAAGAAATAATGTAAATTTAGGCCGAGTGCTTT 

GACTTACGC CTGTAATCCTATCACTTTGGGAGGCCAAAGCAGGAGGGTCA 

CTTGAGCTCAGGAATTTGAGACCACCTCTGGCAACATAGTGAGATCCTGT 

CTCTACAAAAA A TTTTTTTTAAATTATCCAGCATGGTGGTACACGCCTGT 

A77CCCAGC7AC7CAGGAGAC7GAGG7G7GAGGA77GC7AGAACC7GGGA 

GATCAAG7CAACAGTGAGCCGTGGTTGTGCCACTGCCCTCCAACCTCAGT 

GACAGAGGAAGACCCTGTCTCAAAAAAAAAAAAAAAAGTAGTAAGTTTAA 

AGAACTTAGTGTAGGCCTGGCATATAAATGATATTGTTGATGTTGATGTT 

AGCTTGAAGGCACATTTATAGGAGTAGGGAT7TTATAACATTATGAGCCT 

GAGAGCACATATAATGTTCCC

>Contig11

GGTCTAACATGCTCCAACTGAAGAAACCCCACACTTGTCCGGCAAGGAAA 

CTACTACAGATTTCCTGACCTACTGTGCAATTCGGGGCATGCGACGGGAC 

TGTGTTTCTGGGTACGCTGTCTCAGGTTCGTCTGGGATGTAAGAATTCAA 

CTTCAGTAGTTCTCTCATAGACGCCGACGAGAGGGGCGTCTCTTTTCTCT 

GATGAATCTGCCAGATCTTCCACTTCATAGAGTCTAAATCCTCCGATTCG 

ATCTACTGGAGACCCCCACGTTACAAAAACG7CTAACGTCGGTGACAGCT 

CCCCACATAGGGAAAGATCACCTGAGTCTCACTACCTCACATTAGTGCTA 

TCTCCAGCCCCATGCTATCTACGAGATGGTCACGCGAGGTTTAAGGGGTC 

7CCGA77CCGG7GG7CCGA7TCAGC7AA7CG7GGCCC7ACG7GAACGA7C 

AC7C77GC7CG7AACA7CGATACAGGG7CGCGC7GACAAATGG7ACTACG 

7AGG77C7CAGGTCAATGCCGCG7CACGAA7GAGCC7AAC7ACCCCATAA 

G7GCACG7AC7GTG7TACC7TTCC7G77CGGCCAAACC7GC7AC7GTATG 

C7G7GC77G777

>Contig12

AGGC7CCA7G7GCTCTAGCCTGAT7A7C7TTTCAAG7G7T77ATTrGC7A 
A7C7A7AAGGCCCCT77CG7AAAA7G77CAC7CA7777C7AA77AGA7A7 
T77TIT7AA7G77GAG7777GAGAG77C777AGA7A7777AGA7ACAAG7 
CCA77G7CAAA7ATGTGATTTACAAA7A7T7TCTCTCAATC7G7AATTTA 
GTTT7CA7CC7CTTAACAGGGTCT7T7GGAGAGCAAATAA77TGATTTTC 
A7AAGG77CAAA77A77AA777777C77G7A7AG77CACAC77C7AG7G7 
7AAG7C7AAAAACTGTGCCTTG7CA7AGG7ACCAAAGGTT77C7CCAG7T 
T77777C7AGAAG777AGAG777CA7G7777ACA77GGAG7CCA7GA7CC 
A7TG77AA77AATTTTTGTA7A7AGG7AGA7G77TAGGTTTAGGG7TTTT 
7TAAAAAAAAATTACA7ATGTTTAAT7GC7CCAGTTCCCTTTCATTGAAA 
AGGG7A7CCTTCCTCCATTGAATTGCC777G7CAGAAAT7AA77GGACA7 
A7T7G7GTGAG7CTA777C7GGGC7C777A7CA7G77AC7777AAAAAA7 
GCA7CAG77CC7CCACCAA7ACC7CA77G7C77GA77A77GCAG77A7A7 
AG7AAGCC77AGCA77AGGAAAAG7G77777CC7GC777A77C777N7CA 
AAAAA77777GGATATrCTAGGGCC7rrACA7A7AAATTTTAAAATAAC7 
77G7C7A7G7C7AACCGAAAGCC77A7GAAGA7777GA7AAGAA7TGCA7 
7A7GCC7A7ACA77AATT7AAAAAGAAC7GA7G7C7TTAT7CAGT7GA77
CTGCTAATCTATGAACAImGCATCTCTw . CAAAGCA777AG7C7TTCTT *

AA77TC7G7CA77AA7TTTITAAAA777TCA7CC7AAAGA 

G7777G77GAA7 TCA7GCT TAAGCA77TCAC7T7C7TGG7AACAA77A7A 

AA7GA7777G7G777TTTA7TCCAC7AG77CA7777CAG7G7G7AGAAAA 

GCAATGAATTTTTGTGTGTTGATCTTTGTTCC7ACATCTTGCAACATTAT 

TGAACTCATTTATTAGTTCTAGGAGGTTTTTTCATTTTTCTTGTAGATAC 

CTTGAGATTTTCTATATAGACAGTCATGTTGTCTGCAAACAGGCACAGTT 



TGCAG7GGC7AGAAC77C7AGCAC7A7G7CAAA7AGCA77GG7GAAAGCA 

GACA7CC77G77CC77G7C7TAGAGGAACA777GG7C777AA7C77GGA7 
TGCG

>Contig13

GCGCC7CC7T77C7C77CCAAAA777C7C77G7C7AG77A777G7CCAGG 

GAAATTTGAAAGCTCACTTACTGTGCAAGTCAGCAGGAAACAACTGGGTC 

TGTGCACAGCACCTAGCAAAGTTCTGCTCTAGGAATTACACTTTGGCCCT 

GAGGTAGATTTCTACAAGAACCTTACCTTCTAAGCAGCACTGGGGTTCAT 

CTTTTTCCCAGTCCTCAGAGCCCATTTTCACTCCTGAGTTCTCCCCCACA 

AAGGACATTTTCAACGTTGAGTTTATTACTCAACAGAAAATGGAATGAAG 

TCCAAGACCTAAGGAGATAGAAAGGGGACCAGTTATGGCATCTTCTCACC 

CCAGGACACCTTGCTGCATGTCTCTAGTGCTGAACAGACCACTGGCCTTG 

C7C7G7AG777GAAA7GC7CGCTGCAACCAGAAAGGCACCAAGGGGCCAG 

ACCATGCTCTCCTGTCTATCACGCCTTCAAAGCAGAATTTCCCAAACCTT 

GAG7CACAGTGCTAACACACGGGGTGCCATAACATTTTTGTTGATTTTGG 

CATTTTACAAAAATAAAATAAAAAAGTTAAAAATGCATTGCTCTATTCT7 

GGGGCTGGCACACTATTGCCTT7GGCCAAATCCGGTCCCTGACTGTTTTT 

TTAAATAAAGTTTTATTGAAACACAACCATGCTCTTGTGTACATATTGTC 

TCTTGGCTGCTTCGAAGCTACAATA

>Contig14

GTGTTCGCTTTTTAACACTTACCTAAAATTACTCTGTAATCCATGGATCC 
TTAATTTATTTAAAAAACTAATGTTAATGAGTAGCTTTATTTTCCTCCCA 
TCTAATTTA AGGC CCACAGAACACCTTCACTTACCTCAATCCTCTCCCAA 
CTTACATGCTTTTAATGTCATATATGTTAATACCGTATACTTTTAAAACT 
TTCTAAAATAGCATTATTTTATAGCATGAGTGTTCATTTACATTTTTGCA 
TATATTTAGAAT TTTC TTTGCTCTTCGTTTCTTCTTCTATTTATGACTCC 
CCTCTGGGATCATTTTCCTTCTACTTGAAGTACATAGTTTAGAACTGCAC 
TATTCAATACAGTAGCCACTAGCCATGTGTAGCTATTGAAG7TTAAACTA 
AGTAAAATTGAGTAATATTAAAAACTCAGTTCCTTCATCTCACTAGCCAC 
ATT7CAAG7GCTCAGCAGCCACGTGCGACTAATGACTACTGTACATCAAA 
CA7A7AGAACA7T7CCA7CA7GGCAAAGAGC7CTA77GA7AG7G7TCA7C 
vAGAG77TC7G77CCAGGACCAAAC7GAGGG77GGGC7GC7A777C7CA7 
GGCCCAA7AACAAGATGCAGATGAGCTGGGGAGGAAGAGAGTr777A77T 
C7GCNACCA777 ACCGG GAGAAGGCCTGGAAA7CATCACCAGGCCAAC7C 
AAAA77A77ACG7TTTCCAGAGCTTATATACCTTCTAAGCTA7A7G7C7A 
CG7G7AAG7G7GCA7TCACCTGAAGACGT7AGTGA7TAACTTCTTTTAAT 
C7G7AAC7AAGG7C7GAGTCCGGAAGA7C7TCCCC7GGAGCC7CAG7AAA 
T77AC77AA7C7AAA7GGG7CCAGG7GC7GGGG7AA77ACCC77A7C77G 
7CCCC7GC7AAA7CA7GGAGGTT7GGGGATTCC7TTAGAGCACCAA7AAA 
C77G777G7GGAGGCCTGGGGGTTTCTTC7GACCCACAATAAAAC7TG7T 
TAA7C C7AAA7GGG7CC7G77AAGAA77CC77C777A7777G7CA7A777 
7AAGGCCCAGAAAAGGCCTGGGCAAAACTC7TGA7GGGCT7T7G77ACAT 
7CCAGCC777G7A7AAGAACAC7GGTTT77AA7A77TAAC7TAACCAT77 
AG7CAG7AC7GAAACAGTTGTTA7AGAGATC7GCA77AGTGAGACCTGGC 
C7GCCACA777CC7T77CTGAAGA7C7TA7GG7AG7GA7CACC777G7GA 
AAGGAAAA7AAATCT7GGGACC7CAAAA7CAC7AAGCCAAAGAAAAAAG7 
CAAGC7GGGAAGAA7CTGACAC77AAATCCAACACTGC7AACTCA77CA7 
C7CAC7CA77CA77CA7T7TAT777C77777TCTT7C7TT7Tr77777T7 
777777GAAACGAAGTCT7GCTC7G7CACCCAAGC7GGAG7GCAG7GGAT 
C7CAGG7CAC7GCAACC7CCACC7CCCGGG77CAAGCGA77C7CC7ACCT 
CAGAC7CC7GAG7AGC7GGAAT7ACAGGCACC7GCCACCACGCC7GGC7A 
A77777A7A777TTAG7AGAGACGGGGT77CACCA7G7TCA7CAGGCTGG
ATTCAACATCATGCTGAA * CCTTTCAA * uATCATCTTGTTTTTAGTAATL
TCCTACCTTAACTCTCTGTCTTCTGCTAGTATGGGAAAGATGACCTGAAA 
ATCTAACCATTTATTTTTCCCCCATTAATATCATTTTATGATTATTCAGA 
AGTTAAATA ATTGT CATGCTGTCCTCCAAAAAGACTGAATCAACTAGCAA 
CAAATAAGAATTTTCTCACAGCTCTGCCAGCATTTTAAAAGAATAGCTTT 
ATTGAGCCCAGGAGGTCAAGGCTGCAGTGAGCTGTGATTACACCACTCTA 
CCpCAGCCTGGGTGACAGAGCJUlAACCCTGTCTCAAAAAAGAAATTTAAG 
GAACAGCTTTATTGTTGTAAAATAGACATACAATAAACAGAGCACAXATT 
TAAATTGTGCAACTTATACTTTGATATAACCCTGTGAAAACATCACCACA 
ATCAAGATAGTGAATATATTTATCACCTCCTGATACAGTTTAGCTCTGTG 
TCCCCACCTAAGTCTCATGTTGAATTGTAATCCCCAATGCTGGGGGAGGG 
GCTTTGTGGGAGGTGATTGAATTGTGGGGGTGCACTTCCCCCTTGCTGTT 
CTTGAGATAGTGAATGAGCTCTCATGAGCTCCCGTTCACTCACTCTCTTT 
CCTGCTGCCATGTGAGGATGTGCTTGCCTCTTCTTTGCCCTTCTGCCATG 
ATGTGTTTCCTGAGTCCTCCCTAACCATGCCTCCTGTACAGCTTGCAGAA 
CTGTGAGTCAGTTAAATCTCTTTTCTTCATAAATTACCCAGXCTCAGGTG 
GCTCTTTATAGCAGTGTGAAAAGGAACTAATATACCTCCTAAGTTACCTC 
AAGCTTGTTTTTAATTCCTTCTCCTCCCTTCCTTCATTGCCAAGCAAACA 
ACCACCTGTTTTCTGTCACTATAGATTAGTTTACATTTTGTGGGTTTTTT 
TTTTTTTTGAGACAAGGTCTGACTCTGTTGCACAGGAGCAGAGCAGCGTA 
TC

>Contig17

CGCGTTATAGGAGATGCGAACTTAAGAAATGATGATAAGGAGACTTTA7T 
AAATATAATTTTGAATTATTTTGCCATTACAGAAATTCTAATTATTTAAA 
ATTCTATTCATAATTTTTAATCACTGTACTTCCCAAGCTTAGCTTAGAAT 
CCTTCTG7GCTGAGGATTAAT7TTAATTTGTCTTTTATAGGCC7TATCTA 
AAATCCAAGAATAATTGCCAGAATCAACCACCTTCTAAATCTGTAAGTAG 
AAATTAGTCTTTTTAAAAATATGCATTCATAAGTATGATTAGTAATAAAA 
ATAATAAAGATGTTAGCAACCTAAAGAACATGTATTTGAAAGGTATTTCT 
TACAGATATAAAAACAGTTTGGTTTAATAAGAGAaVATaVTTTTTTGAAA 
AGTATGACATTTTTTGAAAAGTAGTTTAGTTTTATTAACCAAGAAAAGCC 
TCAAGTGAACTT TAGTC CTCTTGATAGCTAACATTTATTGAATGCTTACT 
GTGTGCCTGATACTTTTCTGACTTGCATTACCTCACTGAGTCCTCACAAT 
CTTATGAGGCTACTATTAGTAGCCCCACTTTACAGATGAGCAAACTAAGT 
CACAGAAAGGTTAAATAGGTCGTATAGCTATTAAGTGACAAAGCTGAGAG 
CCTGTGATCTTAACCACTTTGGTATGCTGCCATGAAGTTAAATAGCTCAG 
TAGTCATTAAAAGAGAACZATTTGCATTGAACCTTCCAAGCCACTTAACAA 
GTATATGCrTCCTAATCAATTTAATTTAGCTACATTAGATAGAATGGTAA 
AGGATCCTTAACTTAAAGTTTAAATGGAAGAAATTAGCCCTCTGAAAGAG 
GCACAGATTATTCATCTGCAATAAAAATCTCACCTrTAGTTTTTTAAAAC 
ATAGTTTTTATCTGTGTTCTGAAATGTAACTAAAACAGTGCTTCCTGAAG 
TGAAAAATTCTCACTGGTGAGAATTTTAATAAGTTTTAATGATTCACCAA 
ATCACTTCAGTCATATTTCAGTCATATGCATATGCATATATAGACATATA 
AGTTTTTATCTGTGTTCTGAAATGTAACTAAAATAGTGCTTCCTGAAGTG 
AAAAATTCTCACTGGTGAGAATTTTAATAAGJTTTAATGATTQ^CCAAAT 
CACTTCAGTCATATTTCAGTCATATGCATATGCATATGTAGACATATATA 
TGTTGTATGTATACATGACATCATTAGACACTGTGAAGGATAGCAAAATG 
TATATAAGGCAAAAmATGAACAATGGTTTAACGTTTGGGAAGCACTGG 
GTTACACTTTTACTTTATGCAGATTGAACOVGTATAGTATGCAAGTCTTA 
AGGAAAAATCTACTGGAAAGGGCCCTCATTCAGACTTCCCAGAGGCTTCT 
CTGGAAGTTGACAATACTGACTTCAGTACATCAGCTCGTAAATGAGGATG 
ATACCTACCTTATCTGCTTTACACAGTTGTAAAAGTAAAAAGTGAACTCA 
GGAAGGGAATTACAGAATTTAGGAGAAACTAAAAGCACGATGTAAATAAT 
AGTCATCATTACAGTTATATAATGCTTGACAATTTATATAACACTTTCGA 
TACATGACAACAATAACTAACACCCAGACATGTTTATATACATTACCTCA 
CTCAGAACA^CATGTGAGGAAGTTGGCCATATGCTTTAATGTCCAAACC 
AGGACACTTTTGAGAGTAAAAAGCAGTACTCTTTGACCAACAGGCATAAA 
TCAAAACTATCTTGTGAAAACCGGGATATATGGCATCCTTCCTAGATAAT 
AGATACTTTTACTATTATTAATTTTGCTGTGAATCTAAACCTGCTCTAAA 
AAAG7TAATTTTAAAAAGTAATGAAGTACTGATACATGCTACAACATGGG
TAAA7C77GAAAACG77*> JTGC7AAG7G. . AGAAGCCAGACAGAAAAGGC*.

ACATATTACATGATTCCATTTATATGACACATCTAAAATAGGCACATCTA 

TAGACATACAGAGACAGAAAGTAGACTAGCGGTTGCCAAGAACTGCAGGG 

AGCAGAAGATGGGGAGTGACTGCCAATANGAAAACGCATTACGT

>Contig18

tGAATCGCAATGATATGTGCCACTTTGCACTCTCTGTGACATATATAATT 

ATTTTTAATGCATTCATTTTTTTCTCAGAGTGCATTCGTTTGAAAACATA 

GA^GGGAAATACTGGTAGTCTTCCTTGTCAGTTAGAAACACCCAAACAAT 

GAAAAATGAAAAAGTTGCACAAATAGTCTCTAAAAACAATGAAACTATTG 

CCTGAGGAATTGAAGTTTAAAAAGAAGCACATAAGCAACAACAAGGATAA 

TCCTAGAAAACCAGTTCTGCTGACTGGGTGATTTCACTTCTCTTTGCTTC 

CTCATCTGGATTGGCATATTCCTAATATCCCCTCCAGAACTATTTTCCCT 

GTTTGTACTAAACTGTGTATATCATCTGTGTTTGTACATAGACATTAATC 

TGCACTTGTGATCATGGTTTTAGAAATCATCAAGCCTAGGTCAGCACCTT 

TTAGCTTCCTGAGCAATGTGAAATACAACTTTATGAGGATCATCAAATAC 

GAATTCATCCTGAATGACGCCCTCAATCAAAGTATAATTCG&CjCCAATGA 

TCAGTACCTCACGGCTGCTGCATTACATAATCTGGATGAAGCAGGTACAT 

TAAAATGGCACCAGACATTTCTGTCATCCTCCCCTCCTTTCATTTACTTA 

TTTATTTATTTCAATCTTTCTGCTTGCAAAAAACATACCTCTTCAGAGTT 

CTGGG7TGCACAATTCTTCCAGAATAGCTTGAAACACAGCACCCCCATAA 

AAATCCCAAGCCAGGGCAGAAGGTTCAACTAAATCTGGAAGTTCCACAAG 

AGAGAAGTTTCCTATCTTTGAGAGTAAAGGGTTGTGCACAAAGCTAGCTG 

ATG7ACTACCTCTTTGGTTCTTTCAGACATTCTTACCCTCAATTTTAAAA 

CTGAGGAAACTGTCAGACATATTAAATGATTTACTCAGATTTACCCAGAA 

GCCAA7GAAGAACAATCACTCTCCTTTAAAAAGTCTGTTGATCAAACTCA 

CAAGTAACACCAAACCAGGAAGATCTTTATTATCTCTGATAACATATTTG 

TGAGGCAAAACCTCCAATAAGCTACAAATATGGCTTAAAGGATGAAGTTT 

AGTGTCCAAAAACTTTTATCACACACATCCAATTTTCATGGCGGACATGT 

TTTAGTTTCAACAGTATACATATTTTCAAAGGTCCAGAGAGGCAATTTTG 

CAATAAACAAGCAAGACTTTTTCTGATTGGATGCACTTCAGCTAACATGC 

TTTCAACTCTACATCTACAAATTATTTTGTGTTCTATTTTTCTACTTAAT 

ATTATTTCTGCAATTTTCCCAATATTGACATCGTGTATGTATTTGCCATT 

TTTAATATCACTAGACAATTCAATCAGGTTGCTACGTTGGTCCCTTGGGT 

TTACTCTAAATAGCTTGATTGCAAATATCTTTGTATATATTATTGTTTTT 

TCTCCTATCTTGTAATTTCTTTGAGCACATCCCAAAGAGGAATGCCTAGA 

TCAATGGGCACAAATAATTTGACAGCTCTTATTAAACATTATTCTGTAAG 

TAAAAAC7GAACTACTTTTCAGTATCACTAGCAACATATGAGTGTATCAG 

CTTCCTAAACCCCTCCATGTTAGGTCATTATGAACTTATGATCTAACAAA 

TTACAGGGTCTTATCCCACTAATGAAATTATAAGAGATTCAACACTTATT 

CAGCCCCGAAGGATTCATTCAACGTAGAAAATTCTAAGAACATTAACCAA 

GTATTTACC7GCCTAGTGAGTGTGGAAGACATTGTGAAGGACACAAAGAT 

GTA7AGAA77CCA7TCCTGACTTCCAGGTATTTACACCATAGGTGGGGAC 

C7AAC7ACACACACACAC^CACACACACACAaVC^^ 

CA7GCACA CACAA7 C7ACA7CAACAC77GA7777A7ACAAA7AGAA7GAA 

777AC777CrrTTTGG77CTrCTClTG^CCAG7GAAA7TTGACA7GGG7G 

C77A7AAG7CA7CAAAGGA7GA7GC7AAAA77ACCG7GA7TC7AAGAA7C 

7CAAAAACTCAATTGTTTCTGACTGCGCAAGAAGAAAACCACCCATGCTG 

C7GAAAG7CAG77G7CC777GTC7CCAAC777AC77CC777ACC7C7CA7 

A7G777G7GAA7AAGCCCAA7AAGCAGACNCC7CC7ACAAAG7GAACC7G 

G7CTC777CCTCC7AACAGGG

>Contig19

G7C77G7AACACAGGTAAGACGAG77CAAG7TTTA7TTC7TGN7TTTAGA 
ACGG7AG7GAGCGGTTTTCAGCNTGAGACCACACC7AAGGTAAGTAGC7G 
AA7TGGGG7TTTGTCTTGGCTAAAGTTTAACAACCAGCTGG7CTTAATTT 
C7CC 77ACCA7TAGAGCACTCAG TAA7C ATATAAGTTGTGTG ATCATT CA 
77T7GC77AAC7GTTTGTTTCTGTTT7TATTGCTGTTTCAGTCTT7TTCC 
CAT7GGG777GACC7ACTCTATC7GAC77GA7CAAATCCAAAGGAAAT7T 
CCAAA77A7GGGGAATGAGGCCTCTGAAG7GGC7AAA7TCCCACCC7CCC 
ACACACACAAACG7GGTATGGTGGGGGAAAAAACGGCCAGCAAAAGAAAA 
AAAAAAAGGAAAAGA7GTTTCATTITGACCACCAAACGGGCT77A77TAC 

FIG. 3 (7 of 52) 






ATAACAAGGCCACCTTT - VGCTAGCCA .CCA7AC7GAAAGAGOTATGL ^ 

TGTTGCCCCATGCTGTGGGTTCCATAGCTAACGTTCTGCCTTTTTTCCTA 

CCACGACAGCCTGGGTTTGGTTCCTAAATCAAGCCTITrCTGGTTTGATA 

CTTGGTAATGCTGAAATAGCAGCAATTTGTCCTAGCTGAAATATCGTAAT 

AAGATTTTAAAAGATTTATTTTAAAGGACCTCAATAGTTAAAAGTCAGCT 

TAATTAAAAGCTAACATCCAAGATGTGTGCATGTGTATGTATGCGTCTTT 

GTAT TTAAATAGCCCTCATGTTTTTTTTTTCTTTCCTAGGAACTTGCCTT 

TTTTTGAGCAAAAGTTTTTTTCTTCTCTGTTGACTGGATTCTGTTTTC7T 

CATTTACTTCTGCTGTCTCTCCTTTCTCTTGCACCGTCTGCTGCATGAGA 

GCCCTAAAATAGTT TATA ATAGCCTGGGGTTCCTTAAAGAAAATGGAGAA 

GGTGCCAGGCTCCCTTTTAGGGAGAAACTTCTATTTTTCCTTATGGAATC 

CCTAGAGTGTAAACAGACAAGTTCATTTCAGCTCTTAAACTGCTTGCGTT 

TGTGTTGTGTTACCTGATTTTTTTGACTATTATATTTTTGACTAGCTATT 

GCAACAGAAGCTACTCTTGGGTTTTCAAGGAAGATTGTAGTTTAGACATG 

TAGAAATGTCTTTTAAAAAAAAAACAAACTTTTTTTTAAGTGCACTGTAA 

AAGCATCATATGGTCTAGCCTCCTAATAATTTTCCCTTTTTGGAGACCAG 

GATTCAGGGTGGGCTCTGCCCAGAGCTCAGAGATCCAGTTAAAAGAGAGG 

TAGTCTCGGCCGGGCGTAGAGGCCCAGCCTGTAATCCCAGCACTTTGGGA 

GGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCCA 

ACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCTGGGTGTG 

GTGGCAGGTGCCTGTAGTCCCAGCCACTCGGGAGACTGAGGAAAGAGGAG 

AATCGTTTGAACCCGGGAGGCGGAGCTTGCAGTGAGACGAGATGGCGCCA 

C7GCAC7CCAGCC7GGCGACAG7GAGAC7CCG7C7CAAAAAAAAAAAGA7 

AGG7AGAC7CGA7G7TG7CG7ACCCGAGCAAG77AGAGCAACGCCACAC7 

TTGAGACGAATTTAAGAGTCCTTTATCAGCCGGCGACCAAGAGACGGCTA 

ACGCT CGAAATTCTCTCGGCCCCTTGGAAGGGGCTTGATTTTCCTTTATG 

CTTTGGT7TAGGAAGGGGAGGGGAGCTCAGTTGCAACAATTCTACAGGAG 

TAAAAACATGCAAAGAAATTAAAAAGACAAGTGGTTACAGGGAAACAAAC 

AGTTCCAGGTGCAGGGGCTCTAAATCTATCATAAGATGTTAGGTATGGGG 

GCTCTGCCGGACACAAACTCAAGGCTTTATGCTGTTATCTCTTGAGCGAA 

ATCC TGGGA ACTTCGTACATTGCTTGCTTCAGTACCTTATCAGTTAATCG 

GACTCTTTGATATGTTGGGAGTCAGCGTACACAAGTTAACTCCTTGAGGA 

AGGGGGTGGGTAAGGAGTCCTTGATGTCTGGTAAATGAAGGAGCGAAATC 

GAGTTCCTCTGGCTTTCTCAGCTAAGGGAGAGCTTATTCATGTGGAAACA 

AGGCTAAGTGATTAAGGGAGAAAGGGAGAGTCTGAAAACAAGGTTAGGTA 

TTACAATGTCAATAAAATTGGTCTCCTTATACAGTCCTATGGTAGATTTC 

777CCA7C777AA 7CTC CCTC7AGCACCACCAGAC77777C7C7C7G7AC 

C77GAGA7G7AAA7777GCTA7C7GAA7777CG7C7AAGAG77G777CC7 

TTAATATGCAAATTTAGGGTTATTTAGCTGACAACTGCO\AAGTAGTGAA 

AC^AGTTATCAAGAACTTGAACGTCTAAGGTAGGAAAAAAAAAAGTCTTT 

ATGAATCTATAAGATGTACTTCTATTGGCATGCCTAATACGTCTATGTAT 

TTACG7G7TGTGTACACAGTTTTTCACTACTGAAAATATATAGAGGAGTT 

CTAA77AA77GACTTAAGACAATAAAAGCGCTTGAA7CAAATACC77A7C 

AGGAAAAAGGAAAAGACAAG7CAAA7GC77G77CAAG7C7A7A7AAC77A 

AG7AAAA7C77TAATAAATAAGCTAGCTTrAACA7TAT7TGAAATGTC7T 

AAGAA77GCCAGCAGGTTCTGGGTTACAGAAC7AGTGGGGGTGCAGTGGG 

G7GAGGG77GGTGGGGTGGGNGGTNNNACNNNNNCNCCCCCCCCCCCCCC 

CCCCCCCCCCCCCTCCCCCCCCGCCCCGNGCGGGCCGCGCCCCCCCCCGC 

CCCCCCGGCCCGCCCCCCGCGGCCCCCCACCCCCCCCCCCCCCCCCCCGC 

GCCCCGCCCCCCCCCCCGCGCCCCCCACCCCCCCGCCCCCCCGCCCCCCC 

CCCCCCCCCCCACCCCCCACACCCGGCCCACACGCACCCCCCACCCCGAC 

GCCCCCGCCCCCCCCCCCCCGCAGCCGACGCCCCCCCCCCGCCCGCCCCG 

CCCCGCACCCCCGACCCCCCCCGCCGCCCCGCCCCCGCCCCCCCCCCCCG 

GCCCCCCCCCCGCCGGCGCGGCGCCCCACCCCCCCCCCCCAGCCCCGACC 

GCGCGCCCCCCCCACCCCCCCCCCAGCCCCCGCCCCCCGCCCCGACCC

>Contig20

GGCAG7ACGC7A7AA7TCCC7C77CACC77ACC7CA7C7GT7C7C7GA7G 
GA7G7A Cr rT ITTTTTTA GT7TC7AAA77CCC7777CC777GC7C7GGAG 
A7GGG7GA77GA7G7AGTCTGGG7A7T7GT7CCC7CCAAA7C7CA7GTTG 
AAA7G7AA7CCCCAG7GTTGGAGG7AGGGCC7GG7GGGAGG7G77TGGA7
CA7GGGGGCAGA7CCC. -A7GAATAGC .'GG7AC7G7CC7C7CAATAG j
AATGAGTTCTCCTGAGATATGGTTGTTTAAAAGTGTGTGGCACTCCCCCA 
77GC7C7C77G77ACTGC77TCGACATGTGACA7CCC7GC7CCCC77CGC 

- " ^ ^^i^S A ^ CTTGTACAGC CTGCAGAAC7G7GAGCCAAAA7AAAC7 




r5^;'i^^S:^ GGG ^ TGG ^^ T GAGG7CAGGAGA7TGAGACC 




GCACCAC7GCAC7CCAGCCTGGGCGACAAGAGTGAAACTCCATTTAAAAA 
GAAAAAACAAAA7TTCAAACAGAACAAAATGAAAAAAATACCAAGTGAAA 
GGCCCCTA7AAAAACCCCTCTGGGGCCCATCCTCCCACCCCCTCAAGTGA 
AACCACAT77AAOU7T7CX3TGK^TA7CTTTCCAAACCTTT7GTTG7ACA 




• - - - — •- * "nwiti inniaiftin i i UAUATLi 1 TC.TATGTCA 

G7GCA7G77GGCTCGATGATATTCTATCA7TAAA7ACCCT7CCAAAAA7G 

GTAAAATCA7T7TAAAAAA7CAT7CACACAAGTACATA7T7ACAATTrrA 

AAAGAAAACAGAATCCCAAAACACAACGACAAACC7CTAAAAATAATCTC 

7ATC777CCACCAGCATGGAACAGT7CATTCC777T7CACA7AAAACGAA 
TTATGTGATTGGaAAnaTTiarTfps iwi^i /-.»m^.» ... . 




Z77ACAC77AGAAA77AAGTCAATA7AC7A7GAA7ACACATTGTGATCAG 

77A7AA7A7GA7GC77C7TAGTCTAGGG777CAAT7AAATAACAGTAAAA 

AAAA77GGATAAATAAGACAGCTAATAAC7GAAAAA7CCAGAAATTCAAA 

GAT7ATA7TGCCAACTAAAACAC7GCCAT77ACA7TT7TT77TCCTACT7 

GGTAGCAAA7GC7AATGGAATTCAA7CC7GAT7ACT7AAAGTCAGTTCAC 

ATCACACA7TCAATCAGGATAA7ACGAACATAATATGCCTACTATAGCGT 

7AGAT7AAGACA7AAAATTTTTTTGCTTGAAAGTAATGACTGCGTACCAC 

TTGAGACATTTGTCAACCACTTCAGCACATTGTTTACGAGTGACTGGATG 

TCCACAAGGAA7AAAAACGACAGCAA7ATT7C7ATCCATACAGA77T7GC 

AAAGCTTC7CCTCTTGCAGGTGTCTTAGCTGCTCTTCAGTACTAATCTCT 

77C7GCAA7GAAGTCTCACTTGATTCGTCTTGTGTACTGTCTTTCTGAGC 

C7TCAC7GGATC7GCAATCAGAACCTCAAG7GATTTACAGTTGCTCCCAG 

A7G7C7GAAT7T7T7CCTCCATTAT7TTCT7AA7GTC7T7GAAAC7GAAC 

C C GA77CA7A7AGC7TCT7GTACCATAGGAT7A7GGAAGATGG7A7CAA7 

7777C7AG77AG7GA7GGCGTT77T7CAGCAGT7C77ACCAGACACTCCT 

:AAG7GAA7GGGA7AAATGAATATTGTTTATATA7777CG7G7CTTC7GT 

7C7AACAGA7A777ACACCC7GGATGCCA77AACA7G77G7CCCAAGGG7 
C7TNC7GGGCT

>Contig21

C7TTC7CCC7T7TTACCCCCATTTTCGTAGGGATTTGGTTAAAACCCATG 
TAAAAAATCCAAACACCGGCGGGGAACGGGGGTTCAAGCTCGTATCCCCA 
CCAC7T7GGGAACCCAAGGTtMCAGGATTGTCGGAAGCCAGGCATTTGAG 
CCCACCCTTGGGAAAAAAAAGAGAACCCCCATTTTTTTTGAACAAAAACC 
CCAACCC7CCCAGGAAAGAAATAAGTATGGCTGGGTTGAAGTCACCAAAG 
A7GGCCGAC7GGCTGGTCAAGTAACTTTACCTGATGGTTCGTAGAATATT 




TGCA77CAC7AAG7AAAAGTGA7AATAGCTACTT7TAAGTAAAATAATGA 

A7GAA7CAAACACTCTAAATCCATGGTGCTATGCTAAGCTCTTTCTGTAT 

7TTATC7CATT7GATATTACAAATATTTGATGTGTTAATAG7AATGACTA 

7C7CCA77TTTACAAGTAAGGAAACTGACATTGAGAGATTAAAAGACTAG 

CACAAA7CACAAAG7AAATGAGATTTGAATCCGG7C7TGATTCCAAACTC 
TACA.GTATTrTaAATTraa<inanar~riaaT-rfiT» nrjivfi/nw/^ »mm 




C7AGCCAGTGCATCATCTTCCTGTAGGCAAATATGCAGGAAATCTATAAT 
AAGAACG7CC77TGG7GAAGGCCAGG7GCAGGGGCT7ACAC7TGTAATTC 
CAGCAC777GGGAGG7CAAGG7GGGAGGG7CGC77GATGACAGGAGTTTG
A.GAACAGCCTjGGaVACAXAGTGAGACv-wTGTCTCTACAAACAARAACAK

ACACAAAACAACTTCAAGAAAACTCCTTTGGTATGGATCAGAACAAGATG 

AArrATCTATCTGATCCAAATGCTTAATGACATTAAGCCACAGTCCACTC 

ACTGCCACAATAGAGATATACCTGCCAATGCCACTCAGGTAATCCCATCA 

AAAGTGGTAATGAGGTCTGCAGCATGACTTGTTCTTAGTGATCCCAGCCT 

OAGACCTTGAGATTGCAGCATTTTATTCTACATATGCACAAAACATCTGT 

TGAAAAATCTTCTAAATTGATGCAATACATTCGTATCAAGAATACCTGTC 

TGTAATC7CCATAAACCCTCTCCTTTCTGTTTTAAAAAATAGTAACAGCA 

TTTCTCC7TACATGACAAAGAAATGACTTCACCATCTACGAAATAGTGAA 

TAGGAGCT3TGTGGAAGGAAATTAGCTCTACTTCTTGGTGGAGATGAGAA 

3GGAG7GTTCCTCTGAAAATCAAGGCTCTTGTCATGCTAGGAGCCAAAGT 

CGTTTTTTAGAGTGTGGACAGTTGAGAAGATAAGACAGGGACCATCCACT 

CATGTTTTTCTTATTCCATAGGCCTCTCTCAATTGGGCAAAGCACTCCAG 

ACCTTTTGGAAGAGTGACACCAAAGGCAAGCACCTGCTTGGCAGGCCCCT 

CAGCTTCTACGCAAGTATAAGTGAGTATATAAAATGGGGGTACTTGTGCT 

GTTGAGTACCTTATTTCCAAATGAGGCCTGCCGGTGTCCCTG IGGC TGTG 

AGAAGGCCTCTACTGGATAGGTGGAAGTTGTGTGTTCTCATCTTTTCTAA 

CCCTGGATTGACTTGCCCAAAAGGAAGCCATTATTAACACTATAATAAAA 

CCATCCTTAATCTGGGACTCTCTTCATGCAGTGGTTCTTAACCAGTGATA 

AACATGAGAGTTACTTTTGGAGCTTAAAAAAATTAAGATGCTCAAGGTCT 

ACCCAAACTGACTGAATCTCCAGAGGTGAGGCCCAGGGATGTATACTTTT 

GAGCCAGACCTCAGTTTACCCTGCAGAGCTCATAAGGTTGCATAACACCC 

TT r GTCAGCCACTCTGATGAAAAGAAAAATTGGTGAGGAATAAGTTTTAG 

AGAAGAAGGAGCAAAGGTGTTCTTGGCCAGTGAGAGCCAATGACAGGGAA 

ATGCAAACAATGTATCCACAAGAAAGGTAAATTACCCTATAGAGCATTTT 

AGGATAAATGAACATCTCATGCCTAGGGTTGAGAGAGGGTACAAAAAAAA 

AAAAAAAAAAGACCACTCTGGATACACAACGCGATAAATGGAATAAAGAA 

TTTTrrCCTTGTAAATTAAAAAAATCCTTTGTTACTGAGGTA TAATTTAA 

TCTATTTTATGTATAGTTCAATGAGGTGTTATAGA TAATAA A'l"ri"l"l"rTT 

GTAAATT ATTATATTGTCATATACT CAT ACATT CATTTTTAAAAGTCAGA 

AATGTATATAACCATTAAACTTATAAATCATTCAGTCATTCAGAGATATA 

GATACACGAGCATATTTTATATCCACCACAATAATTATTACCATCTCAAC 

AATTCCATCACCCCTCAAATTTCAAGCGTAGGGGTTTTTAAATGTCAAAG 

GAGTCTACTCAGTGGGAAGAAAGTTAAGGAAAAAACCTTTGGGGCTTTGG 

GCTCCTTCCCCCTGGGGTrAAAAAGGCAGGAAATTGGGCTTACCCCCCCT 

GAAATTGGGAACTGAAATTTTGGGAAGTTTAAAAAAAAAAAAA 

'cAAGCAGCCTTCCTTCCTTGGCTTCCCAAATTGTTGGGATTACAGGCAT 
GAGTCAGGATTCCTGGCTTAGTTTACATTTTCTAGAGTTTTGTATAAATG 
GAAACATACAGAATGTATTTTTTTGCGGAGTGGGGGAGTGTTTCTATT7C 
TTTC^TCCATTTTCCCCCCCCCNCCCCCCCGAGACGGAGTCTCGCTCrG 
TCTGTTGCCCAGGCTGGAGTGCAGTGGTGCGATCTCGGCTCACCGCAAGC 
TCCACCTCCCGGGTTCAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCT 
GGGATTACAGGCGCCCGCCACCACACCTGK5CTAATTTTTTTTGTATTTTT 
GGTAGAGACGGGGTTTCACCATGTTAGCCAQGATGGTCTCGATCTCCTGA 
CCTCGTGATCTGCCCGCTTCGGCCTCCCTAAGTGCTGGGATTACAGGCGT 
GAGCCACCGTGCCCGGCCCAAGTGTrTCTATTTCTTAACCAGCTTTCATG 
rAATC— TTTTrATTTTACCATCrCTGTGATCCCACTCCCAAA GGTA CTA 
GATGTCGATTGGTCCTTAGGATCAGCTACCATTTGCCCAACTGCTTTCCA 

GCCTTCCAAAAA TTT ' 1 ' 1 1 l ' t 1 1 1 1 1 T TCTTAAAGATACTCCTGTGTGAGG 
CTCAGAACTCTTGAATTGCTACTGCAAATATGAACTCGGTGATGTGAATG 
CCAGGGAATTGCCTGATTGATCAAAGAAATGTATCCCCTTCTCCCTCACT 
CTTGC7GTCTTCTCATTTGTTTTCCCCATCCTTGTGGATTCGTGAATTTA 
AATATCCCTTTAATGTrATAATATTTTAATGGCCTTTGGCGAAAAGTACA 
GAATTAGGTGCAAGAGTGCATAGCTGTTATTTTTTTTTTGGCCTCTGAGA 




r AATGC CATTTCTGGTTTGTACTTCGGTAAGTTCAGATGACCCAATATAT 
^GT^ACATGTGGCATTCAGTAAAAAAGTAGCTTCCCCTCCCTTTCTTCT 
^ c „;;^^-r27rrCCTGCTTC7ATAAAGCATCrGCTTTGGGAAACTTCT
7AGGAGGAGAGC77GCCr>JCCCG7GGC_ .vA7GGAGAGG7C77GCAS&GA :

AAAAGAGA7GC7CCCAC7CAA7GCAGGA7GG7G7GGAGG7AAA7GGGGA7 

ACG7C7GGCA7CAC7CAGGAA7GGGCC77CCTGGCAGGGAAAAAAAGGGA 

GGGGAAAGAGGAAGGGAA77CNNANA7NAA77GC7GAA7ACGGGGA77CC 

ATGGC C7GGA7CCAGGAAGAGAAC777GGGAGG7G7GAAC CTGGAAGGCA 

7CANC7GA7GAGGAGCAGCC7GAAC7CCGGGGAGGACC7G77777GG7GG 

ICCGGAAAAAAATGCCTTCCACACACAGGGAGGCCACCCGGCTGATGGGC 

TGGGGGTTGGACGGACAGCCCTAGGACAGGCrTGGGAAACCAGGCTCAGG 

TAGGGCCTGCGAGGTTCTCGCTGCGTCTCTTTCCTTCCTGGTCTTAGAAA 

ATAGAATCCAAGGCCTCTTGAGAGTGGAAGGTGGGTTGGGAGGAGGGCAG 

A7GGGGC77AGGCCCAGGACACCCG7AGAGC7AC7GCCCAGC7G7C7C7C 

AGGGACTCTGCTGAGGTCACTCCAAGGATCATTCTTAGCCTTGCTAGACA 

GTACTGACAGAGGGAACCGTAGTATCGCACCCACTTCCTTCTCTTTCAAT 

GAAAG777AAAGG7CACCA7T7CC7C7GGCAAAGGAAG77CCAGAAA7A7 

TCCATTTCCGGTCTTAGAAACAGCAAGGTATCAAGCAATTGCAAACTTCC 

TGTGCTGGGGAATTCCCAAGGAAGTAGGGGCAGAGTTCTGGTGGAGACAA 

AG7GAA77CCGAG7GA77AG7CAG7AGCAG7AGCAG7AGCAG7AGCAG7A 

GCAGTAGCAGTAGCAGTAGCAGTAGCAGTAGCAGTAGCAGCAGCAGAACC 

AGAATT7CCCCGCACGTGTCTCAGGCTCTCATTTGCCAACTCAGTCTCTA 

AG7A77777A77GGCAGGAAAAA7AAAA7AGC7A7GAG7GAAA7AA77CA 

T7AGACC7GAGCC7CCA7CAA7777G7G777AAAGGCC7GAC7C7C777A 

CC777CCC7GGGA7GGAAGA7GCAAA7G77CC7GA7C7CAC7G7CAAAAA 

AGAAGAACCAGTGGG7ATATTGTATGC77GAG7TCCAGCCATTAGTCACA 

AGACA7AGAGA7GAC7GCCA7G7G7G7AGAC777C7A7AGAC7G7G7GC7 

AAAC C CG AC C7GC CAC77CCAAGGAG7AGA7GAGGAA7G7 C CA7GG77C7 

GGGGAGCCC7ACCCCAA7T7GGGGCAGACA77CCAAAGC7CA7777C7G7 

GGAGGGGG77GA7GG77AAAGGAACGGC7GGGA777AC7C77C777C7AG 

GGCCAAGAAAA7GACA7GC7GCC7CCA7G777AA7CA7CC77CCCCC7G7 

TAA7AAC7A7GGC777AAG7CCCCGG77AGGGCC77CC7CCAAAA77GGG 

GAAAA AAA77CCCC7CCCCCCC7AAAAA7777777777AAAAAAACC777 

777777GGGGG77GGGAAAAAAACCAAAAA77777777CCCCAGGGG777 

777AA777AAA777C7CCCCAAAAA777G77777777777CCGCGAAAAA 

AAGACCCCCCCAAAAAAAAAAAG777777GGCGGAAAAAAAAA7A77777 

TT7G7GT7AAGAAA7GGAGAAGAAGGGGGGTTTTTTTT77CT7C7CCCCC 

CACCCGCCAAAGGAAAGG77G77CACAGA77G7777G7G7C7CCCGCCCA 

T

>Contig23

A7G7GCC7GCGAAA7CA7CC77CCAGAAA7A777GCCCC777C7777G77 
A7AGAG7GGCAC7GC CC7A7A7GG7GACCAC77GC CACA7G7GGC7G77G 
AACAC77GAAA77GGC77G7CAGAA77GCAG7G7AAAG7G7AAAACACA7 
ACCAAA777CAAAGACA7GGCACA7AA7AAAAAA7G7AAAA7A7C7CA77 
AACAAT7777A7A77GAC7G7G7AAG7AACA7777GAA7A7A77GGA77A 
AA7ACA7GGA7GA7GCCCCAACACCCACAG7CCC77A7CAAG7C7C7AC7 
7CACA77777G7AC77C7GAC77AGAAATAGCAC7GGCG7C7AAGAGCC7 
A77AA7G7CG7CAA7AGG7TCTTGGGAACCACAA7777AAACAAAA7GAC 
A7A7AAGAAAACGAA7AACA7TGAACAAAATGACA77A77CGAGGACC7G 
C7GCA7G77G777CAC77AAAG7CAG7G7CCAAGAAAC7A7CAG7GACA7 
T7AG7GAGGAA77GC7G7CC77CC7G777ACAGGAACC7GGGCAAG77AC 
T7AA77CC7C7AAGCCCGGT77A7A7CCC7GCAAAGAGAGAAGGA7AA7A 
A7CACCAG7AC77AG7GA7G7CG7AAGGAGAAAA7AAAA7AA7AAA7A7G 
AAA7GGC7GACAG7G7CC77G7CACACAGAAGA7G7G7GA7CCACAG7AG 
C7GC7A77G7C7GCC7CAC77CAC7AG7AA7GG7CCAGGGAGGCC777AA 
TG7GCA7GG7GCAG7ACA77CACA7G77GGACA7GGG7GAAGGGAAAGAC 
CAGGC7CA7C7AAACACAA7AGGA7GC77G7GG7G7777GAGGAGGAA7C 
AAGGAC7AG77A7CCACAGC7G7AACA7GCA7GGA7CAAAAGAGA7AAGG 
CACACAAAAGAC777G7CAG7AGCAAAGCA77ACAAAA7GCAGAGACCAG 
C7G7GGG7GG7GG7GAG7CAGACCCAGC77CCC7C7G7GCC7GGC7GAG7 
GG77C7GGGCAAG7CACGCCA7C7G7C77GA7GCCC77CCCCA7C7A7AG 
AGAGGGAGCAAC7GAGGCCCC77CCAA7AC7GAAG7CC777A777C7GC7 
AC777AGAAA7A7CCACA77777GG7AAA77CAAA7GA7CCAA7GA77CC
ATTTCCTAATGTTCAAAAv. fAGCCCCAv. ^CA7C7AAA7GAA 7CA^A £^ ■

AATAAAATATTTATTGTGTATGTTTTGATTGCTGAAAC7TCTATTTTAGC 

AACACACACACACACACACAGAACCCATAAGCCTTCATCTTTCCTTGGAT 

AAACGAGCCTTCCTGTCTGGCCATTTAAGTCACGATTAAGTAAATGATTT 

CCAACTCGCCTTTTGCAGCAGTTCAGATGGGTCTTTCCTGCGTGGCAGTG 

GCCCTCCTGACTTATGATTTCCTGTGTGTCGGCCTGTTACCACTGCAGCT 

TAACTGAGGAAACAAGAACAAAACAGCCTCTGACCCCAAGAGACTGTTGG 

AGGCAAAGGCTTCAGTCCCAAGAACCTCACACGTGGGGAGCCCGAGAGCC 

CAGCCC7GACC7777C7CCAG7AA7AACA7AAGAAACAACAGGCAC7GGC 

C77A7777GGATACAAAGAG7GG7GC7777CC77AAA7C77CC777AG7C 

AGGGG7ACCCCTTCA7GGACGCCCCAACA7CCA7GG77CC7GC77GAG7C 

CC7GC77CCA7A77CCTGCACTTC7CAC77GAAA7A7CCC7GGAG7ACG7 

TAAGCAGCCAGGTTTGGAAGTTCTT G CTGTGCAGGCGGGTGTGTGCATGT 

CCTCTCTCTCAACAGGACACAAGCTCCCCAAATCAGACGGTATGCCTCCA 

"GCCC CT7CCCAAGCCTCCCCAGCAGCACCGAGCATGTGAGGGGAGCTGG 

GGCCCAGGCCATGATGGGAAGCACTCTCTGCCTAAAGACTAGGGTGATGC 

GCCCTCAACTGTGGGAATGAGCCCCAGCTCTGGTGTCTGCCTCGGTTTTT 

CCTCCTGGACAATCAACATGAACTCCTCACCCCTCTTATCCACTTTGCAT 

AAACTGAAAATAACAAACCCAGGGTCTTTCTGTCACAGGAAAGGGTTTTT 

TTTATAAGATTAAACAGAGATGATTCAACACACCCAGGATATAACACAT 

GGGCCATGAG7CAAGGCCAGGCATTGCTCTGGTCAGCCTGTTGTTTGGGC 

CCCCTTGGCAGGGC7CTCCCCTGAATCTTCCCCC7CTTGACTCCCCATCA 

""ACAGCACG7CCAGCTTTGGGTACAAGGCCAG7AAATGGGGAAGGGGGT 

CAGA7GACA7AAAGAGCCC777CC7G7CCCA77GAAA7A7A777GGA7AA 

CAGA7GGCA777CCCCC7G7G7C77GCCCAGGGCCCAGAGCC7CCAC77G 

C7AGAGGCAGACAGAGGATGGAGAGCCCCTTCA77AG7GGGAGGACA7CA 

CAGG7GGGCAAGAAACCACAAGC77GCAC7GAGGCCCAGC C77GAAA7AG 

CAGCACC7GCCGGCACC7GTGGTCTGGGGACAGGG7CACAGGA7GGAGGG 

GCCTCC7AAGCC7TTTATC7C7A7G7AC7AAG7ACAACCCA7777CCCAC 

CTCACAGAGCCAGA7CAGCC7CTG7GAGG7CC7GG7GGCAAAAGGATAAT 

7GCC7GCCCGCC7GCCCGCGG7GGGG7GC77G7GC77GCA77CC7GGGAA 

GG7TGTrGGG77AC7CTGCAA7AGG7C7C7C7GACCAGC7CACCC7CC7A 

CTGCAAACC7CAAACCAAC7TCAAAGAAGA7CCAGCACC 

CGCGTAG7C7AAAGAC7GAGTC7GAAGC7G7CCC77CC7GC7A7GGAC77 

CAGAT777AGCCCAC77GAA77GC7CCA7A7CC7CCAAGCCA7GGCCA7C 

-"TTGAC7C7C7GGGC7CCCAAGCAC77GC7GCC77CA7CACACAG777G 

AG77AAGGCAGAAAGAC7GG777CCA7G7ACAC777G7GGAAGC777C7C 

Arr7C777A7A7AA7C7C7G7CC7TrG7C7AC7GC777AAAA7C7AGAAA 

-"7G777ACAAACACAAAGG7GA7CC777AAAAGC7CA AAGC 7GA77G7G7 

CACCAATATA7ACCAC7C77AA7GGCT7CCCA77AAAC777GAG7AAAGA 

"777A7GGAGCC7ACA7AAGGCCATGACTACC7GGC7CTTAT777CC7CC 

TCA7CC7CA7C7CACGAAC7CAC7C7CCACTCC7ATACCCC7CACrCCTr 

^CCCCTCC7C7C7C7GAGCrCCAGACTCCCAArrACC7ACTTCCACCCTT 

-"TTGACCCCCAGGGAC77A7C7CAGCCTGGAAT77TCCC7CT7TGC7C7C 

CACTGAAC7G7CCACTCCCAG7C7AAGACAfGTGC77A7G7CACACGCCC 

TTACCG7GC77A7CTCAGT7TGTAA77ATCTAC7CA77TAGAAAAG7GTT 

GATGAAGG7C77CACTGTCAGC7TTCAGGATAG CAGGAA7CA7AG C7GA7 

TTTACTTACTTAACGGGG777CAT7CT7TG7AACTTr7777777777GAG 

A7GGAGAC7CACTCT7GCCCAGGCTGGAG7GCAA7GGCA7GA7C7CGGC7 

CLACTGCAACCTCCACCTCCTGGG7TCAAGTGATTCTCC7GC77CAGCC7C 

CCGAG7AGCTGGGAT7ACAGATGCC7GTCACCACGCCCAGC7AMrTT777 

G7A7T7777G7AAAGACGGGGTTTCA7CA7G77GGCCAGGC7GG7CTCGA 

7CTCC7GACC7CAGGCGA7CCACCCACC7CAGCC7CCCAAAG7GC7G7GA 

T7ACAGGCA7GAGCCACGGCACCCAGCCAC7CC777777AC77A7GGG7G 

AGAAGCCA77AGAGA7CA7TrC77C7777CTT7CrC7C77CAC7AAGGCA 

CCAGGG7CAC7AAG7AG7AGGA7AC777GAAC7AGAAC7CAAGAAA77GA 

G7777AA7777ACC7CACAC7C7CA7A7GA^C7CCA7G7GACC7CGGG 

CCA7AC w "CCCC7G7ACCCTGTT7CC7C7777A7AAAAG7AAGAG7TrAA 

AC7AGA7GG7C7CCGACA7GCA7CC77C7C7AACA7A77C7GGAACC77C 
ATAGCACGTTTGAGGGGGAAAGACCCTAAGGATGATCTTTATAAGCCATC
ACTTGGTGTTGGTGGTGATAAAAAACTCGAGTATCTTTATGCAGTGGAAA 
GAGAAGATTGGACTCGGAATCAGAAGCTTGAGTTCAAGCACTGGTTTCAT 
CAGTCTTGTGATCTTGGGTTGGTCACTTAACCTCTTCAAGGGTCCTCAGC 
TGTGAAAGAAGATAGTATCAGCTAATTCTTGTATGTGCAGTGAGGAGGCA 
GTGAGATAGTGCAGGTAAACTATAAAACAATTGTCACATGAAACGCATCA 
CAGTGATTCTTTGGACCCACAAGCTCCAATCTTATAAAACATATCCAGTC 
ACCCACCAACATAGATCATCTCACCTTGCATATCTGATTTTGTGGATCAT 
GGGGAAAAACTGCTGATTCCTAGCAAAACCCATGGCATAGGATAAGTGCA 
CAATAATTTTTTTTTCCTAAATGATTTAGATGACAGTGACTCATTAAGGG 
TTTCCTGAGGCCTCCTCAGAGTCGAGAGGTGGGTGCCtGAAGCCACCCAA 
AGTCCCTGTCACAGGATGGCTCCCAACGCACACACCACAGGCCTGCCCAG 
TATGTTCCACTATCTACCCAGTAGAGCCCTGCCCAGTACGTTCCACTGTC 
CCTTCCCTAGAAGAGGTGACTGTTGTTCACAGTCCCAGAAAAGCGGGCTC 
CCCA AAACA ATGCAAGGACCCACCTCTCTCTGAACCTCACCCACCCTAGT 
TTTCCTTTAAAAATCA^TTTACAAGAAGATCATGTGAAGGAAAAGGTTGG 
GTGATATTCTAACCCAAGTTAGCTGTTTCTCAACCAAGTTCTCTTTGAAA 
AATTCAACAACCACCTTTGGGGAATTATTTACAACAGAGGAGTGAGGATG 
GGACCAGGATAGGTATTGCCTATGTTGGTGGAACCAGGGTTTTTTTCCTG 
GATTACCAAAGAGATGGTATGCATTGCTCCCAGAAGCTAAATATCTTCAG 
GCTTTCAATGGTGGCCTTCACCTGAAAATGTTATCCCTGTTGAAGCTTTC 
AAGCCAGTATTTTCATAAGAACTATATTTTCTTTGGTGAACTGAGGCATT 
ATAATGATGACTATACAGGTTCTTGAGTGACTGAAGCCATCATTAGCATT 
3TCATTATTTTTGTTTAGTTGCATCTCCATAGCAGCTCACATTCACAATG 
TGCTTTGCAATTGTTCCTTAGCAATAGCCCTCACAAGATTCTCAGGAGGA 
GAGGGTTAATCCGGATTAACATTTCTGTGAAGCCTAGCGAGATTAATCGC 

>Contig25 

AAGAGTTTTAAAATTAAGTAAGGACGCCGGGAAACAAATCAATCCCAGCA 
AACATTTTGTTGGGATTTATCATTCAAGCAfcTTTTACAGTTATCCCTGTC 
AAATACATTAAGTGTTCAAAATTGGGCATAGGGGGAACAAAATAATAAAC 
CCAGCCAAAACAGAATAATCCCTGTTTGTTCAATGTTGGATAAAAAAGAC 
ATTACTATTGGTGTAA GGAAA TTAGATACATCTTCCATTATTTAGTAAAA 
TTACCATAACTTCTAACTTTGTGGCTTTAGGCAGTCTAGTCCACAGGCAG 
GAAGGAGGTTTGTTTTGGCAAATGACTGTTATGVTCTTCTGTTTCAAAGC 
TAAACCATAAACTAAGT7CCTCCCAAAGTTAATTCAGCATATGCCCAGGA 
ATGAACAAGGACAGCCTGGACGTTAGAAGCAAAATGGAGTCAGGTAGGTC 
AGATCTTCTTCACTGTCTCAGTGATGGCAGTTTCATAACTTTAAATGATG 
GCTATCACAGTTTTCATAAATAATCTAGATAAACAGTTAAAATAAAATAA 
TTAGGTAAATGTAGTGCGATAAATATTAGTAGACAAACTCACCATAATTT 
AGAATCTAAAGTTAAATTAAATAATAATATTTCATTATTTGGTATTTTCC 
AAGAAAAACATATTGTAGGAAACCATTCTTTTTAAAAAAAAAAGTGTCCT 
TTTAAAAAGGTGAATAATTTTTGTCTAATTCAAAGTTTATTGAAAAGTTA 
TGTATAAAACAAGG7AAAAGGAACAAGGAAATAAGGGAAATGTAAAGAAA 
AT7ATAGAAATAAAGTC»w rATTTTTTGGTAAGAAAGCTTAAAGSfliW^***
ATTTTAGGTAAGAAAGAATCTTACCTAAAATTTTGTGCTAGAATAAAGTG 
ACTGGCTAAGAAAGGGATGTTCAAAGCTATTTATGACAAACCCACAGCCA 
ATATCATACTGAATGGGCAAAAGCTGGAAACATTCCCTTTGAGAACTGGC 
ACAAGACAAGGATGTCCTCTCTCACCACTCCTATTCAACATAGTATCGGA 
AGTTCT3GCCAGGGCAATCAAGCAAGAGAAAGAAATAAAGGGTATTCAAA 
TAGGAAGAGAGGAAGTCAAATTTTCTCCGTTTGCAGATGCATGATTGCAT 
ATTTAGAAAACCCCATCATTTCAGCCCCAAAACTCCTTAAGCTGATAAGC 




T CCCATTCACAATTGCTACAAAGAGAATAAAATACCTGGGAATACAACTT 
ACAATGGACATGAAAGACCTTTTCAGGGTGAACTGCAAACCACTGCTCAA 
GGAAATAAGAGAGGAAACAAGCAAATGGAAAAACATTCCATGCTTATGGA 
~, » ^x&Trt iTiTrrrrr.a aaaTrarraTAnTGCCCAAGTAATTTATA 




ATGTTGGGAAAACTGGTTAGCCATATGCTGAAAAU i *> * 

^CCTTACAACTTATACAAAAATCAACTCAAGATGGATTAAAGATTTAAAC 
ATGGCTGGGCATGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCC 




AGTGAAACCCTGTCTCTACTAAAAAAiAi-AAiwvKi. i««<-i.>jv»«w»*ww* 
GGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGG 
TGTGAAACCAGGAGGTGGAGCTTGCAGGGAGTGGAGATCACGCCACTGCA 
CTCCAGCCTGGGCAACAGAGTAAGACTCCATCTCAAAAAAAAAAAAAAAA 
AAAAAAAGAAGGATTTAAACATAAGACCTAAAACCATAAAAACCATAGAA 
GAAAACCTAGGCAATACaTTCAGGACATAGGCATGAGCAAAGACTTCAT 
GATTAGAACACCAAAAGCAATTGCAACAAAAGCCAATTGACAAATGGGAT 
CTAATTAAACTGAAGAGCTTCTGCACAGCAAAAGAAACTATTGTCAGAGT 
GAACAGGCAACCTACAGAATAGGAGAAAATTTTTTCAATCTATCCATCTG 
ACAAAGGGCTAATATCCAGAATCTACAAGGAATTTAAACAAATTTGCAAG 

aaaaaaaaacccatcaa; aagtgggcaaaagatatgaacagacacatctc 

AGAAGAAGACATTTATGT JGCCAACAAACATGAAAAAAAGCTCATCATCA 




-T-AGTTCAACCATTGTGGAAGACAGTGTGGCAATTC w i uuvw^i u * w« 
ACCAGAAATACCATTTGACCCAGCAATCCCATTACTGGGTATATACCTAA 
, » r.a a. a TraTTrraTTGTAAAGACACATGCACATGTATGTTTATT 



GCAGCACTATTCACAATAGCAAAGACTTGGGAAOMVCCutAAi 
AATGATAGACTGTGTAAAAAAATGTGGACGTATACCCCATGGAATACTAT 
GCAGCQ^TAAAAAAGAATGAGTTCATTCTTTTGCACGGAACTGGATGAAG 
CTGGAAGC CATCATTCTCAGCAAACTAACACAGGAACAGAAAACCAAACA 
CTGCATGTTCTCACTCATAAGTGGGAGTTGAACAATGAGAACACATGGAC 

ACAGGGAGGGGAATCTCACACACCAGGGCCTGTCAGC^ 

GGGGAGGGATAACATTAGGAAAAATACCTAATATAGATGACGGGWAATG 

GGTGCAGCAAACCACCATGGCACATGTACACCTACGTAATAAACCTCCAT 

GTTCTTCACATGTATCCCAGAACGTAAAGTAAAATTTAAAAAAGAAAGAA 

AGAAAGAAAAGGATGTTCACGACAAACCAGAAAGTCCAAGCATGTCATGA 

ATAGTCTGTGTAAGTCACAATAAGAGGATTTATTTAAAAAAACTTTTATA 



TTGGGTTGATGTTAAAAAACTACTTATATATTAftAAAAn^i 
ATGAAAT' , TTCTTACGGGGTTGATT CACT CTTAATAAATTATAAGAGACT 
TAAGAATTrrTTTTTAACCCAAAGTTCAGCTTTTATTGCATCTTGCTGTT 
^TAGGT"'"TCTCTCCCCTTTAAAAGGGTGGGAAATAGTAATGCCCTCCTT 
-AA^-"^CAGCTCATATACG77rrTTACCCTCAGATTCTGTTTGTTG 
~GTCC"3ATGCTAACAATGTTTTCTTAAAGGTCTAAAGGAAATGTTTTC . 
T Z CAACA7AA7A77C7G 1 uCA TTGCAGAAGGTCTTTTCT^
3TAACTGGCT TAAC AGATTTTATGTTTTATTGAAATAATTTC7ATGCCAT 
TATTATTAAGTTTTGGTTTGCTTAGAAAACACTGAGATTAATACAATTT r 
77AAAAA77A7GA77A77ACA7C CA7A7A7C777A7G7A7G7GC7777AA 




— « r- — - - - — » - >->-»— www\ itjuii i u 1 i CiAAACACAGA 

GCCAGAAA77AAAGC7A77CAAC7CAAGGCCCAGGAAC7A7AG7GGAAGA 

GG7GGG7G7G7GAGA77G7AAGGGCCAA7777GAGAGA7AAAA7AAG77C 

AA777C7C7A7AAA77AA7CA7AA7CA77GA7G7CCAAGCCACAC7GA7G 

CAAGA7CAGCA7A7GGG7CC7G7G7CAGA77AACAAGG7777C77GAAGC 

A7TAACC7AC7CCTTAATAAAGG7TATAGAGGT7ATAAAAGGCTTCTGGA 

AG77A7AGC7A7GG7CAAGA7AAAAA777CA7AGA77GT7AA7ACAATrr 

7GGAAAACAAA7TTAA77GGC77C77GC7G77777A77AGGGC77A77G7 

77GGAAAA77AAG7C7CG7C7C7CAAAGAA7GAAGGCTT7CACC777T77 

T77777777777AATCC77GAG77A7CAC777GG7CAAA7GAA7GAC77A 

7TT7ACAA7GACC7T7CATCAAG7G7Tr7AAACC777CAAA777GACAAA 

C777CCAAAA7CAAAC7ACAAA77A7G7C77777A7GACC7AA7GAA7CC 

T77AAAA7AC7AGG77CCC7AAAG7CCAAAAAAAAAAAAAAA7AACA7AA 

7G7GGC77A777GG7A7AAAAA7777ACAAGAAACA77G7CAAA7A7AAA 

A7A77G7G7GG7777G777GGGC7G7A777G7A7AAA7A7G77A77GG7A 

TG7G77CCAAAA77A7AGGAAAC7CC7A7AA77C7GA7A7GAC77GG7G7 

ACA77A7CAG7AA7AA77A7AA77G77A7GG7AAA77A77G7G7GCCA7G 

GAGG7AACAAA777CC7CA7CAAG7G7G7C777GAC7A7GG77GCCC7AA 

AAC77777GCCA77CACAGACAA77G7C77GC777GG7CC7C777AGAAG 

G7GG7777A7AA7CAGC7ATAAAAC7C7AACGGG7GC7C77GAA7GCAGG 

C77AAGATAGC777GGAGAC7G7GACA7CAGAA7AGAGGAAAAAC777CA 

G7A77CA7GGAG7GC7GAAA7A77CA7GAA7A7CAAGCAAAACAGGAA7T 

AAC77CA7AGA7GGAAC7AAAAGAA7GC7GAAG7AA7C77777GAC7777 

777C77AGAA7G77GA7CC77CG7777G77777CAGAG7CNAGGAAA777 

77C7G77GAGA7A77GACAGC777AACAA77AAG7A7AC7CCAG7GAACA 
CAA777GGAGCA 
CACTCTCAGTCTTGCACTCACCCTGCCACACTCAAGGGCTTCCCCAGGTT
CCTTCTTAGATTCCACCGATAGCTCAGGGACTTTGCACATGCTACGGTCT
CTGCCTGGCTCCTCCCCAGATCTTCTCATGCCTAGCTGCTTCTCATCAGC
ACCCCTCAGAGACTGTCCCTGCCCCACCTCTCCAGGTTCCATACCTGCCA
CCCTCCCCCAATCACGTAACAGTTTCTTCACAGAGCGAGTTACCATCCCA
GTATTTCCCTAACTTATTTTTTGTGACTGGTCTGTTGCCTGTCTCCACCA
CAAGAACATAAGCTGCATGTGAACAGGAGCCTTGTCTATCTTGTCACCCC
AGTGGCTGTGACATAACCTGATACACATTAGATGCTCAATGATGTTTGAT
GAATGAAGTGCTGGTAGTCCAACTGTGTTTCCTTGTCTGTGTAAGTATGT
CTGTTGTGGTTTCCTAAGAACCTACAGCTCTCCGACTGTGACTCCTGTTC
TATGGTCCTGATTTGCTGGACTAGAATCCTAACCTACATGCTTACTCTTA
GTGTCCTCCCCCAGAGGCTGAATCCCAGTCCCTAAACCTCCACCAAATGG
CTAAGACCTAGCTTCCAACCAGACAGGCCTACGCTGAGACCTCAGCACCG
CCCTTCTGCGGTCTCATCCTTAACGCATCCTTCAGGGCCCAGCTTAAATG
TCTCTTCTCCAAGGAAGGCTATCCTCTTTCTGCCCCTCAGTGCTCTCCAT
GCCTCCTCTATGCCTCCATGCCTGCTTTCAACCCTGCAGAAGTGGAGAAA
TTGCTAATCTGCTGTGTTGACACTGTGCTGGGGTGCCTTGGGCCAGGGAG
CAGGCTGGTGGTGTGCTGATAGCCCGTGGCTGTGCCCAGGTCCATGCTCA
CTTCCTGAGCCCCAGTGGAGTAGGCTCCCTTTCCCTTATTGCAGCACTCA
GAGGAAGGACGTGCTTCTTAGGACAGATCTGGCCAACCTCTCCCTCGTGA
GAGAAGGCCCAGCCATCCTCTTGCCCTCTTTCTTTCTCCTGCCCCCGAGT
AATAAAGGTGCCTGGTCAGAGCCTTCTAGAAGGAGACCCAAACATCCACC
ACACATTCCCAGTTCCAACCGTCATCCACATGGCTGGCTGTGCAGGTAAA
CGCAGAGTCTGTTTCACACACCCAACCATCTAGTATTGGATGGGAGGACA
GTAGCGTGACACTCTTCTCCAGCCTTGAGCCCTACTGTGGGCCCCACCCA
ACCCAGATACCAGAG6ruKCCTGTACTGGGATGC7ATTCK»TC

AGT CATGTACAAAGTTAGCCCTTTGTTATATAGAGTTAGCTACGTACATC 

TTCCTCTGTAGGGAACCCAAGAGGGGAGAAGAGATATGTAGTAGGATTTA 

ACCTGCAAATCCTCTGCTGAGCACCCTGCACTACATACAGTGGGTAGCAT 

3TGG7AGGTGCTCAATAACTATTGACCGATAGATTGAATACAGGTAGGAT 

GGTGACACAATCTAAGATCCCAGGGGTGGGGAGACCACACGCTTGGTTAG 

GGAGACCCAAAGTGGACCGTGTGGCCAGAAGAGTCCCGCACTGCACTCTA 

GTGACAGTGCAGAAAGTCACTGTGGGAAATCTAGAAGTTTCTACAGGTTG 

CTATT'TCATCATAGCACTGTGCAGGCCAACCCTTCCTGCTCCACTGGCTG 

TTGGGAAAAGCTTTCTCTT TT CTTCCTAGCCAGGGAGCTCTCAAAGTGTT 

CCACTCTCTO^CCTCCACCCAGGCGTCCAGGTGTGGAGGACACTTGCCGG 

CTGCTTGTCTGCTGACTCATCCCTTGGTTTCACTTGGAAAACCTACCACC 

AGCTGGCCTCTTTCCAAGCATCAGCCTCCTCATTTTCTTAATCCCTTAGG 

TGTGATCTCACCTCCACACAGTAGATTGCCTCAAGGCCCAATTCCAATAT 

GAATAAAAATGATTATTTTGTCATCrTCCAATCTTCCTTTT AAAATA TTA 

TTTTATAATTCCCTTTAGGAGGATGVCCTAAGTGAAGACTAgrTTTTACCT 

AAGAAATGTTAAAATGTAAAGACATGGTTGTAATCTGGGGATTCCTGTTA 

AAATGGCTAGCAGACAGAAGTCAGACGACAGGCTAGAAATGTGTGAAGAG 

TGGTTGCCrrTGAAAGGCGGAGTTGGTAATGATTTTCTTCCATTTTTCCA 

TGCTTTCCAATTCTCTACAAAGGCCTTAATATTACTTCGATAACCAGGAC 

CTCTGATAACCTGCCCCCACCGAGTAAAGACTTAGCTGGGAAAGTCAGCT 

TCATGTGAGGTAAAAGGAACCAGGTAATACACAATTCCCACTGCCAACTG 

TCGGGTGTGCAGGCCTGAGCTTCCTGCATGTGGGAGGAAAGAGAAAGAAG 

AGAGAAACTCCAAGATCCAAGAGATCCAGCAAGAAGGC7GGAGTCTGAGG 

ACGCAGAAAGCTGAATGGCACAGTTACCACTATTGTGCTGAGGTTCTGTG 

GCCTCTGGGTCTCTTGACAACTGGGCAAAGACCCACAGAAAACTATCTCT 

AGACC CTACCTGTGGGAGGGGAAAGTGCTTAAGATCATTTACAGGACAGC 

CACCTGGACCTCAAATGGCTTACAGTTCCTTCATCCAGAGGGTCTTCATT 

TAGTACATACCAGGTGCTAAGCTGGGTGCTGGAGACATGACGGGGAACCC 

ATTTACCATGGCTTTGTTACTGTGACATTCACATCTAGGGAAAGCCAGCA 

AAGGGGAGGGATCGAGGAGAGCTTGTTAGGCAGAGAAAATACCCAAGGGC 

AAGGGAGAAGCCAGCCTGTTCTGAGCACACACAGTGGTTCCATCTAACTG 

GGCCTCAGTGCCAGGTTGGACTGGAGATGGGGCTGAGGAGCTGTCACAGA 

GCATTCTGGACACAGATGTCACATAGTCCCTTGAGGTTAGGGTCCTTAGG 

CATGGCAGCATTGCTTTGAGTTTTTCCTTTTGTAATGTTGCCATT 

CAATGTGGAAGATGGGTCCTTGCAGAGAAGGGCAGGGCTGTGAGACCAGT 

TAGGAGACTAAGATGTGAGCCAAGGAAAATGAGGAACACCTG AACA CTGG 

GGCAGGTGCAGGGCCCAGAGAGAAGCAGATGGCTTCCTGAGGTTTTAAG7 

^ GGTAGAATCAAGGCAGCTGGTACAGATCTTTTATTACATATAAACTGGA 

ATAAGCCATCTGTTCCAAGACAAAAGAGTAGGCGGAAAACAATACAAGAC 

AGAAATGGAATTAGAACAAACCTGGGAGGAATGTGGAATTAGAGTAGAGA 

GTCCAACACTGGCTGCAATCATAAAAATGTAAAACAAAGVAAAATTTGCT 

AGGTGTGCTTACTTAGAAATAATTAGCTGTCATATTAAGTTCACTTGTGT 

TATGGCTTAAATGTGTCCCCCAAAATGTGATGTGTTGGAAACTTGATCCC 

CAATGCAACAGAGTTGAGAGATGGGACCTTTAAAAGGTGATTAGGTCATA 

AGGGTTCTGCCCTCATAAATGAATTAATACTGTTATCATGAGAGTAGATT 

CCTGATAAAAGGATGATCTCTGCCTCCTCCCCACAGCCCTCTTGTGCATG 

CTTTCCTGCCTTTCCACCTTCTGCTATGGGATGACACAGCAAGAAGGCCC 

TCACCAAATGCAGCTCCTTGATCTTGGACTTTCCAGCCTCCAAAACTGTA 

AGCCAAACAAA TT r CI G ' l T 1 ATTATAAATTACCCAGTCTC AGGTATT CTG 

TTCTAGAAACACAAAATGGACTAAGATCATTAAATTATCATTTTTTATCA 

GACTGTTGA 

>Contig27

AAAATATAACAGAGAGTAAGAGGAAAATTACCTTCTTTCTTTTTCCTTTC 

CCTGCCTGACCTTATTCACCTCCCATCCCAGAGCATCCATTTATTCCATT 

GATCTTTACTGACATCTATTATCTGACCTACACAATACTAGACATTAGGA 

CAATGTGGCCTGCCTCCAAGAAACTCAAATAAGCCAACTGAGATCAGAGA 

GGATTAATCACCTGCCAATGGGCACAAAGCAACAAGCTGGGAGCCAAGTC 

^CAAAATGGGGCCTGCTGCTTCCAGTTCCCCTCTCTCTGCATTGATGTCA 

GCATTATCCTTCGXCCCAGTCCTGTCTCCACTACCACTTTCCCCCTCAAA 
^^^S^^^ JCTTAGATG -^CTCCACTGATAAGTAGGT^
ACTCAATTTGTAAGTATATAATCCAAGACCTTCTATTCCCAAGTAGAATT 
TATGTGCC.GV.CTGTGCTTTTCTACCTGGATCAAGTGATGTCTACAGAG7 
A ^^I A 2EI3"^ T ^ CT ^ TTC AACAAGCATTATTCACTGAG 
AGCC . - GTATTTTTCAGGCATAGTGCCAACAGCAGTGTGGACAGTGGTGC 
^JS^^?^HZ5T A 5™^^^^^^^^^^^^G^ < 3^TATGGAAAA 




AAAGTGGTAAAAACAAAGCCCCTTGTGAGATGAGAGCTGCCGACAGGAGG 
GGGCGGGTCATGGTTGTGGGTTTTTGGGTAGGACATTCAGAGGAGGGGGC 
GGGTCGTGGTTGTGGGTTTTTGGGTAGGACATTCAGAGGAGGGGGCGGGT 
CGTGG7TGTGGGTTTTTGGGTAGGACATTCAGAGGAGGGGGCGGGTCGTG 
GTTGTGGGTTTTTGGGACATTCAAAAGAGTCTGAATGCACCCAGGCCTAC 
AACT7CAAGATGGTAAAGGACAGCTCCAAGGATCAGAAGAAGCATGCTTG 
GAAC7GGGGCA777TGAGAAGGAGGAAAAA7A7GCAGAGAC7AGTGCTTG 




— -.-w«»M>n. wiun y.^ iU ji 1 , iifllA1 x il-wiAAAATAGA 

AAACAGA7CAGAAGGAAGGCAATAGAGAAGCAGAAAG7CCAA7GAGGAGG 
777CACAGCAG7CATGGGGG7GGGGTAAGGAAAAGAAG7GGAAAGAAACA 
GACAGAA77GGG77A7A7777GGAGA7AGAACCAACAGAAGGAAGAGGAG 
AAACAACA777ACTGAGAAGGGAAAAAG7AGGAGAGGAATAGGTr7GGGA 
AA7AAA7CC7GC7GACA77GGAAACCCCAAGGAAGCC7CAAAAG7A7A77 
TAC77GC777AGA777AAAAGAATAGGAAAGAAGCA7C7CAAC77GGAA7 
T7GAAA7C7A7T7TTCCATAAAAGTA77GTTAAAT7C7AC7CA7AC7CAC 
AAGAAAAG7ACA77C7AAAGAGTATA77GAAAGAGTTTACTGA7ATACTT 
AGGAAT777G7G7GTA7GTG7GTGTG7G7A7GCG7G7G7GTG7G777AAC 
CTTCAA77G77GAC7TAAATACTGAGATAAA7G7CA7CTAAA7GCTAAA7 
7GAT77CCCAAAGG7A7GAT7TGTTCAC77GGAGA7CAAAA7G777AGGG 
GGC77AGAA7CACTGTAGTGCTCAGA7T7GATGCAAAATG7C77AGGCC7 
A7G7TGAAGGCAGGACAGAAACAATG777CCC7CC7ACCTGCC7GGATAC 
AGTAAGATACTAGTGTCACTGACAATCT7CATAAC7AATTTAGATCTCTC 
7CCAA7CAAC7AAGGAAATCAAC7CTTA7TAA7AGAC7GGGCCACACATC 
7AC7AGGCA7G7AATAAATGCTTGCTGAATGAACAAA7GAATGAAGAGCC 
7ATAGCA7CA7G7TACAGCCATAGTCC7AAAGTGCTGTTTCrCATGAAGG 
CCAAA7GC7AAGGGA77GAGCTTCAGTCCTTTT7CTAACATCT7G7TCTC 
7AACAGAA77C7C7TC777TCTrCA7AGGAGA7GCC7GAGA7ACCCAAAA 
CCA7CACAGG7AGTGAGACCAACC7CC7C77C77C7GGGAAAC7CACGGC 
AC7AAGAAC7A77TCACA7CAG7TGCCCATCCAAAC77G777A77GCCAC 




AC77G7GCAG7G7TGACAGTTCATA7G7ACCA7G7ACATGAAGAAGC7AA 
A7CC777AC7GT7AGTCATTTGCTGAGCA7GTANTGAGCCT7GTAATTCT 
AAATGAA7GTT7ACACTCTTTGTAAGAGTGGAACCAACAC7AACA7A7AA 
TG77G77A777AAAGAACACCCTATAT7TTGCATAGTACCAATCA7TTTA 
ATTAT7ATTC7TCATAACAA77T7AGGAGGACCAGAGC7AC7GAC7A7GG 
CTACCAAAAAGACTCTACCCATATTACAGATGGGCAAATTAAGGCATAAG 
AAAAC7AAGAAA7ATGCACAATAGCAG7TGAAACAAGAAGCCACAGACCT 
AGGA7T7CA7GATITCATTTCAACTGTTTGCCT7CTACrTTTAAGTTGCT 
GA7GAAC7C7TAATCAAATAGCATAAG7TTCTGGGACCTCAGTrrrATCA 
TTTTCAAAA7GGAGGGAATAATACCTAAGCCTTCCTGCCGCAACAGTTTT 
7TATGC7AATCAGGGAGG7CA7TTTGG7AAAATAC77C7TGAAGCCGAGC 
C7CAAGA7GAAGGCAAAGCACGAAA7G7TA7T7TTTAAT7AT7ATrrATA 
7A7G7A77TATAAATATATTTAAGATAATTATAA7A7ACTA7ATTTATGG 
GAAC CCC7 7CATCCTCTGAGTGTGACCAGGCATCCTCCACAATAGCAGAC 
AGTG77T7C7GGGATAAGTAAGTTTGATTTCATTAATACAGGGCAT77TG 




AA77GG7CCGA7C777GAC7C7TT7GCCA77AAAC77ACC7GGGCAT7CT 
TG77TCA77CAA7TCCACC7GCAA7CAAGTCCTACAAGCTAAAATTAGAT 
GAAC7CAAC777GACAACCA7GAGACCAC7GT7A7CAAAA C 777 C 7TTTC 
TGGAA7G7AA7CAA7G1 . rCTTCTAGGTTCTAAAAATTGTGATCAGACua

7AA7G77ACA77A77A7CAACAA7AG7GA77GA7AGAG7G77A7CAG7CA 

TAACTAAATAAAGCTTGCAACAAAATTCTCTGACACATAGTTATTCATTG 

CCT7AA7CA77A7777AC7GCA7GG7AA77AGGGACAAA7GG7AAA7G77 

TACATAAA7AA77G7A777AG7G77AC777A7AAAA7CAAACCAAGA777 

TATATTT7TTTCTCCTCTTTGTTAGCTGCCAGTATGCATAAATGGCATTA 

AGAATGATAATATTTCCGGGTTCACTTAAAGCTCACATTACACATACACA 

AAACATGTGTTCCCATCTTTATACAAACTCACACATACAGAGCTACATTA 

AAAACAACTAATAGGCCAGGCACGGTGGCTCAGACCTGTAATCCCAGCAC 

TTTGGGAGGCCAAGGTGGGAAGATCACTTGAGGTCAGGAGTTCAAGACCA 

GCCTAGGCAACATAGTGAGATCTCATCTCTACAAAAAAAAAATGAAAAAT 

TAAAAAATGAGCTGGACATGGTAGTACACACCTGTAGTCCCAGCTACTCG 

GGAGGCTTGAGGTGGGAGGATCACTTGAGCCTGGGAGATGGAGGCTGCAG 

TGAGCCA7AA7CACACCAT7GCACCCCAACC7GGGCAACAGAG7GAGACC 

CAGTCTCAAAAGATAAATTTTTAAAAATGTTAAAAAATATATAAAAGAGA 

ATTTTAAAAGAACAACTAATAGATCAAAGCATGGATGCAAGATATATTTA 

GTTGGAAAATCAAGGTTAAAATCAAGGGATCTTGGAATTAGGTGTGGTAG 

ATTTGGGTAAGGAGTAGTCTAAGATGACCCTGTTTCTTGGTACTGGAGAC 

TGGATGAGTGGCAGCGTCTTAACCATATTTTTGGTAGAAATATGGAGGTC 

TTCTCCATTCCAGGATGAATGATGAGTAAAATTTTAGGCATGTAATTTGA 

GCTACTAGAAGGACACTCAATTGCAGATG7ACAATGGGGAGATGATAACC 

TATCTGGAACTCAGAAAAATAACTGTATATAGATATGAAAGACATCAGTA 

GG7ATGTAGTAGATAAAATCCTAAAAGTGATGTCAAAGGGAGAAGAGAAG 

TATATGGTGAACACTGTTGTTTGTCCATGCAATTGCCATCTCTTCTTCTT 

CCT7AC7GACAGAACCC7GA777CAC7GAGAAG7CAACA7GCCC77CCCC 

AATTGATGAATCCAATTGGTTGAAGATTATGTTCATTCTATTCTTACATG 

ACTAAGTCACGTTGACTTAATCCTATCAAATGAGATGTCGATCTGGAAAC 

AACTTCTGGAAAAGATTTTCTACCTTGATAAAATAAAGAGCCATATAGAT 

GGTCCTTTATCTTCCTTCTTCCTTGAATGAGATATGTTCTATGAGGAAGT 

GAAGCTTAGAACTGTGGTCAGCAACTTGCAACGACTGGGAAGTCAGAGCC 

ACACAATGAAGAATGCAGAGTGGAAGGAGAAAAAGAGCCAGCATCTCTGA 

CAACATTGTTACACCGAGAACCTACCTCCAGATTTTAAGAAAACAAGAAA 

TGCTACTGTTATTAAGCCATTTCACTGGGTTTGCTATGACTTGCAGTCAA 

ATCTAGCTTAACTGATACAGAGCACCACAGAGAACTGGTCTCTCATTTGT 

CTCATCCTGTTCTTTCTAGCAGCCACGACTTTCCTAGGGTTTCCTTAGCC 

CAAGTC7GGCTAGAGCAAGACTAAGTAAGACTTGATTCCTTAATGTCCT7 

TTGTTTTAAGAAATATTAAAGAATTATTTTTATATTAATATATTTTAAGA 

.\ATAAGGAAATACAAAACACTGAGCAAGCAACACAAATTCAAGAAATC77 

AAAAAG7A7AA7AGC7GC7CAG7C7C7GA77AACAG7GAAA7A7GGAA7C 

ATTGTAGAAATGGCCTTGGAGCGTTATTCTCCCAGGCCAGCTATCCTTAT 

GGTC7GCCCCACCTCCCTCATTGCCTAAACAGTAAGAGAGTCCCATGGTG 

AGAC7CAACAG7CrrAGCACAGAAC77G7TACAG7C7AlT7CT7T7C77A 

CAG7CC7A7A7A7CAATTCCAAA7CAATGAGAGTAAAGCCCAA7CCCTGC 

C777AAACCCAAAGGACAGAAGCCCAAAGCCCAAAGA7A77CCC7AACC7 

7C7CCCCC7 

>Contig28

CC7G7CGC7CCCTA7GTTTAAAGC7GGGGA7CTC77777CC7G7G7C7AA 
r7A7777CCTCAT7GGCTTGAAAAA7CTGA7AAAACA77T7AGGAC7G7G 
TA7AAAA7AGAATTAGCCAAG7GCAATG7CTT7ATTCAGA AGAAA TTTCA 
TGGACG77GTGCC7ACTCTC7TGGC77CC7GGC77CA7GGCTTTCCAGA7 
CCCACAG7AAGCTCTGGATAGTAGAAGT7A7AGTAAGAC7GACT7C7AAA 
7AAA7GAAG7GAC777AACCT7AC7GA7A7GGC77AAAGAAAAGGAG7GG 
CC777AAGA7CCA7GAACTTC7CAAACAAAAG7GATAACG77A7C7CCA7 
GCA7A7A7AA7AC7AAATATAA7GCAAC7GAGAGAAG7AGGC7G7GG7AA 
GAAAGGAGACCCAAG7GCCA7C7GAAGGCAGCAC7TACCAC7C7GC77CA 
TCCCACCGAGGAAACAAAGCA7GAG7ATrGCCAGAT77TC7 7C7G7 T7GV 
AGAAAAGCCAGAAA7CCAGG77777GCG7GAAA7G7CC7GA7777AA7G7 
-"GGGAAC7AA777A7A77TTGAAA7AACA77G7G7GGGACAAG7GAAC77 
37A7G7GGAAC7GC777C7CCCAG7GGCGACCAG77TGGACCG77GA7AC 
TCAGCAAG77CAGCCAAG7GCGCC77G7CA77G7CAG7CA7CAAGG7GA7 
AGGAC w . ■ ^ACKrjTTCATTTGCCO^TGCAGATCTTGTAGTCCTGTTTA TfC
l^~^?l^^ZE^l GCA ^ T ^ AT ^ TG 'nTrArrTTAAGCAGCGAGAGC 
^I^^^ZE^ TCTGGACC ^TCTAATGATCATTTAGTATCAGGC 




^i^Zir^^Xi^^^^^^^^^G^GACTGACAGTCTGTGG 

jTATAAAACTAGTCTAAGGTAGCATCAGGAAGTTCATGAAGCCAAAATGA 
^^^^^^^^ATTTGTTrrTGCCTCCCTCTCATTTTT 

™-;TZ^^^^ gtcttgctctgtcatcca tgctcgtgtgcagt 
ggtgcaatctcggctcactgcaacctccacctccagggttcaagcaattc 
:^^5I5^SS: cc ^ gtagctga: " acaggtctgc accaccccgcc 
■oGuxAGx , "ttgtatttttagtagagatggggttttgtaatgttggccag 
gctgccctgtcattttttttttactagtgtccagtc^gttttttagggg 
ctacataacatgatactgtcattaatctaatggctaatgaaagggatatg 
tatatgtttttgtgtttaaaacaaacttctttggggtcctcaataatttt 
taagagtataaaggggtcctgagatcaaagagtttgagttctgctggact 
gggacagtggttgtcaacccagattgtacattagggtcatctgggaagct 
ttaaaatagtactgatgcccaaccttaccgcaaaccaattaagccagaat 

C7C7G7GGA7GAGAAG7C7TCA77G7CA7CA7CACCA7GACCA7CA7CA7 

tgtcaccgtcactacaccattatcatcatcatcatatcatcttcattatc 

A77G77AG7A7C7CCA7CACCATCA7CAGCATCACCA7TAT7A7CA7CA7 

catcatc-ccaccatcatcctcatcggaacttcacctgcatggaggacaa 
tccactatgcattaggtgctatgctatttgc7atactccttattctcaca 



ACTGCCCAGAGAGGCTGATATTATCTCACTTTATAACAGGAGGAATCTGG 
ATCGGAAAAGTTAAGGTAAGCTAATTCACAGAGCGAGAAGAGATAGAGCC 
AGGA7TCGAAACCAGTTCTCTGCTACATCAATGTTCCCAGTCCTTGCACT 
ATTGA GAACCTCTTTAGTTATGCTTTCACCCCTCCAACACCACAGTAAAT 
TTTTTCTTTTTTTAAAAAAATTATACTTTAAGTTATAGGGTATATGTGCA 
TAATGTGCAGGTTTGTTACATATGTATACATGTGCCATGTTGGTGTGCTG 
CACTCATTAACTCGTCATTTACATTAGGTATATCTTCTAATGCTATCCCT 
CCCCGCTCTCCCCACCCCATGACAGGCCCTGGTGTGTGATGTTCCCCACC 
CTGTGTCCAAGTGTTCTCATTGTTCAGTTCCCACCTATGAGTGAGAACAT 
GTGGTGTTTGGTTTTCTGTCCTTGTGATAGTTTGCTCAGAATGATGGTTT 

CCAGC77CA7CCACG7CCC7AOUVAGGA7A7GAAC7CA7CC7TT7T7A7G 

GCTGCA7AGTATTCCATGGTGTATGTGTGCCACATTTTCTTAATCCAGTC 

-A7CA773C7GGACATTTGGGTTGGTTCCAAG7C77TGC7AT7G7GAATA 

G7GCCACAG7GAACA7TCA7G7GCA7G7G7C777A7AGCAGCA7GA777A 

7AA7CC777GGG7A7A7ACCCAG7AA7GGGA7GGC7GGG7CAAA7GG7A7 

77C7AG77C7AGA7CCTTGAGGAA7TGCCACAC7G7CTACCACAA7GG7T 

3AA77AG777A7AGCCCCACCAACAGTGTAAAAGCATTCC7ATTTCTCCA 

CA7CC7C7CCAGCACCTGTTGTTTCG7GAC7T7TTAGTGA77GCCATTCT 

AAC7GGCACCACAGTAAATTTTTATAGA7T77A7AAGCAAA77G7ATTTA 

C7G7GCAAGAATTGGTTTATTTTTrAAACCATG7G77GCAAACA7ACAA7 

GG77AA77GTGATATTTGCTCAGTACAAGATCATCAGA7CACTACACAGA 
CTTGAGG7AATTCCACCTAaa anr a a af!&m & r~m.% n^^^t, w» » 




GCC77AA7GTGG77AACTATGTAATrTTT77C7GAC7T7T7GAAATACTG 
AGAAGAGC7CATGACTCTCCCATC7CCTAATTCTACCT7GGTGGATTTTA 
G AC7GAC CACAAC7CATGGGTAAA7GAGGGAAGACGAATAAGAAACCTTG 
C77777777CC7CCTTGTTTTTGGCTGGCTGCAG7GGCTCACACCTGTAA 
7C7CA7CACTT7GGGAGGCCAAGGTGGGAAGA7CACT7GAGCTCAGGATT 
7CAAAAC7GGCC7GGGCAACATAGTGAGACCCCA7CTC7AAAAAAAAAAA 
AAAAAAAAAAAAAGGCGACAGGCGG7GCG7GCC7G7AA7CCTACCTACTC 
AAGAAGCCGAGG7GGAAAGA7CAC77GAGCATGGGAGGTCAAAGCTGCAG 
7GAACC77GA77GCACCACTTCA77CCAGCC7GGG7GACAAAGCAGGACG 
C7GCC7CAAGAAAACAAAAACAAAACC77AA77T7TrGGC7A77C7TT7C 
7GG7AAGAA7GG7A7AGAGATGGGGA7GAGGA7GGC7A7TG7A7GAGAGA 
GCAAACAGGG7CCAAGCAG7GC7C7GGGC7G7C7AAGGACCAG7AG7CAG 
C77AAC77C7CAAA77TCCAGGGAAGGAG77CGGAG7GG7AGAA7ATCC7 



FIG. 3 (19 of 52)



WO 99/06426 PCT/US98/16102

GGG7A7GCCCAAAGCA7LaCC77GCAAA7AGCC7G7CA7GAA7AA777Gv 
^TCA777G77A7GAC7GGAAAC7GGC777G7G7A7GCCAGAGAA7GGGGG 
CAGGAAAGAGAGA77GG7G7C77GAGC7C7C7G7GCC7C7GGGGCAG7GA 
-GCT""-"C:CTCTCATGTGGAAGGAGAGCATGAC?GAAAAGGTGCACAAAT 
^GGTGTCTGTGAGAGAAATTAACCTTCCAGATACAGAGACACAACCTTC 
CCCAAGAGG7CC7CA77GC7C7GCC777777CC77777777GC77G77C7 
ACCA7TAA7AACAGAAAC7GA77A7GACC7CAAAAGAGAGGAGAAAGCGA 
C7C7CCCCACCC7AGAGC7AG77AACCACCA7A7C77CC7AGA7A7CC77 

GAGAGCAATGTAACCC 
>Contig29

GTGAACTCGTTTTACCTGTGTAGCAGACCAAGCCGCAGACAAAATCCNTC 
AGACACCAAATTAAAGAAGGAAGGGCTTTATTGGGCCTGGAGCTGCGGCA 
AGACCACG7C7CCAACAACCGAGC7CCCCGAG7G7GCAA77CC7G7CCC 
X* r 77AAGGGC7CACAAC7C7AAGGCGG7CCACA7GAGAGAG7CG7GA7AG 
ATTGAGCAAGCAGGGGGTATGTGACTGGGGGCTGCATGCACCTGTAGTTA 
GAATGGAACAGAACATGACAGGGATCTTCACAGTGCTTTTCXXATGCAAA 
TAACCGATTAGATCAGGGGTCGATCTTTACCAGGCCCAGGGTGTGTCACC 
GGGC m G7C7GC77G7GGA777CA777C7GCC7777AG77A77AC77C777 




CTTACCTGCGGCiviAlj 1 UAVj(- 1 UftAKU lUUl i.rt*vruj\irvJ i - - - 

CA7CA7CAGGGAAGCAGGAAA7C77GCC77CC77G77GGAAGCAAG7AAA 
AC7CAAAACAAACAAAGAAAAAAACAGGGAG77G7ACAGCAAAA7AAAC7 

T7GA" GACCAAATTTTGGGAGATCAGGAATTCTCTGAAGGAGATGC 

"TTCAGAC'"'* , CAGCAAATTGTCCTGTTGGTTTGAGCCATAAAGTTAGCTC 

ATGC"' r: GTACCAAACACCAGTAGGAGATT7GTCAAAGGTAAGAGGCATCT 

CCACTCAGAATCCCTTCGTGGTTACCAACATGTGAACCTTGGAAATCTGA 

GACAGGTCTCAGTTAATTTAGAAAGTTTATTTTGCCACGGTTGAGGACAC 

CCACCCA7GACAGAGCA7CAGGAGG7CC7GACCACA7G7GC7CAGGG7GG 

TCTGAGCACAGCTTGGTTTTACACATTTTAGGGAGACATGAGACATCAGT 

GAATATATGTAAGATGTACACTGGTTCCCTCCAGAAAGGCAGAACAACTT 

GAAGCAGGGAGGGAGCT7CCAGGTCACAGG7AGG7GAGAGACAAACAA77 

GCATTCTTCTGAGTGTCTGATTAGCCTTTCCAAAGGAGGCAATCAGATAT 

GCATTTATCACAGTGAGCAGAGGGGTGACTTTGAATAGAATGGGAGGCAG 

GTTTGCCC7AAGCAGTTCCCAGCTTGACTTTTCCCTTTAGCTTAGTGATT 

TGGAGGCCCCAAGATTTATTTrCCTTCTACATCACTGTGGGCAGCTGACT 




^C^CAACC' , 'CC'"CTAGCAGCCCCAGTGGAGATACAGAGGAAGCAGACTA 
GCGA7ACAACCCAGCC7GAAG7777G7C7GG7GAG7G7AA7GGAA7AAAA 
ATGGGAAGGGTGCTGAAGAGACCAGCAAGAAAATGGTTGAAGAGATGGGG 




AGAGAGAGGGGATAXt-TAivatjOi. awv»v_vj«.a a w twnnrw™**™. ~" 
GG7GC777GAGAAGAGAGAGGG7GAGAAAGCAGGAAGGC7GGAGGC7G7C 
ATCCAAGAGGCGGACATCTGTGAACATGATTCCAAGAGTCACCAGACCAT 
GGGGGTGGCCAAAGGGAGTGCCTCTTCTCACCTCCTACTCTT^TTCCTT 
GTACTCAAGATAATAAGTTCCCAGAAGAGAAG7ACCCATATTTAATTCAT 
CTGTGTC7TCC7AGCAGTACTAAAAATATTATATGAAAGGTATCAAACCT 
TTG^GAATGTGTGCTGCTAAATTGTTAAGGATGCTGGAAAACTCAAGACG 
TCCCTGATCCTGAGCCTGaGTATGAGCCTGTGGTGAGCCCAATGCAGGTC 
. .«-~^«^ 1 -*&r«/"&TO&fl&rrT&GGGACAGAGA7 



TCTGAGAGGTAGGTCTTGATGTCCACATTTTGAACATGAGGACATCCAGC 




CCAGCCCAT7T37GAAG7GCATACTATAGGTAAGT7GGCACAGGAGGAG* 
ATT - CAC«AAG« i A AATTCCTTGAAATCCAGTTATCTACAGGACAATGT^




^?E5Z^^^ GGTGttGGAGCCCGACGG AGATGTGGTCCAGCAGAGA 
•^A^iriZ^^^^^^C-CTGACGAAGGATATATGCTGAT 



AGATAGCTGGACACATGAAGGTTCCTGGAGGGTGGTGCCCCAGAGGGGCA 

tggaagctcc^caccccttctcaovtgctttgctcixk:gcatctcttcat 

cttaagtgttttcctgagttctgtgagctgctctagcaaattcacggaac 
ccgagggaagcaaacccagatttatagccatcagtcagaagcataggtga 
caacctaccacttgtaactggcacctgaagtgggaggcagtcttgtgaga 
H g ^S cct ^ cctgtgggatct ^ cgcta actccaggtagatagtgtt 
ggagtgaattaggacacccaactggtgtcggctgctggaggactagtggt 
g^gaaatccccaagcatttcggtgactagaggtcacagaagaactcag 

4 G ZI G ^ GG7OTGTGA ^GTATGGTAGGGAAAACTGCGTCTGGTTTTTTC 

ct^acaa tcagtta aatatttaacacaagtctactgtatattagtaaa 

AGGG .TACATTTTTTAATGTCTTGACAGTTGCACTTTGACAACTTCCATA 

T c ^TS ACTTTT ^ CGTGTCCGTTTCG AAcauu^ 

ATGAACCAGGCTGCAGCGTATTCCCCAGGCCTTGAAAGCTTGGAGGCCAT 
TTTGCCAGCCOTAATCCCTGTGAATACCAGGCTTCGTGGATTTAAAAAAT 
AGACTTGAGGCCAGGCCTGGTGGCTCACACCTGTAAGCCCAGCACTTTGG 

gaggcagaggcggatagatcacaaggttaggagttcgagaccagcgtggc 
caacatggtgaaaccccgtctctactaaatatacaaaaaaaaattagccg 
GGCGTGATGTTACACGwJAGTAGTGCCAGATACTCAGGAGGCTGAGGCrtG

3AGAAATACTTGAACCTGGGAGGCAGAGGTTGAAATGAGTCAAGATCGTG 

CCACTGCACTCCAGCTTGGGCGACAGAGTGAGACTCAGTTTTCAGGGGAG 

rTAAAACAATACAAAAAAAGAAAAAGACTTGAACAATGAGGCTCCACTGG 

ATGGATTTAGGGGAATTACAGGAAGCAGGACCTGACGGTGCAATGCCACA 

CTCCACCTGTCCAGAATTGGACCTCACCAAGGGAGGTCTGTGGGGACAGG 

3AGAGGCCCTCTGCCTCCACCCCCTCCTCTACTCCCCAAACCCTGAG7CA 

GSCTGAATGTAGTAAACCTGGAACAGAAAAGTTCAGTTTGGCAATAGGTA 

TCTGAAGGACTCCAGGTGCTTCTCCCTTGATTCAAAATTTTACTTATAAA 

AAAAATTATAAGAAAATTCTACTTAAAAGAAATAATCAGGGAGGTACAAC 

AAATTGTACTTTTTTTTTTT TT ' r ' I ' r TTTTTTTTGAAATGGAGTCTCACTG 

7TGCCCATGCTGGAGTACAGTAGTGTGATCTCGGCTCACTGCAACCTCCG 

CCTCCTAGGTTCAAGTGATTTTCCTACTTCAGCCTCCGAAGTAGCTGCGA 

TTACAGGTGTGTGCCACCACACCCGGCTAATTTTTGTATTTTTGGTAGAG 

ACGGGG7TTCACCATGTTAACCAAGATGGTCTCGAACTCCTGACCTCAGG 

TGACCCACCTGCCTCAGACTCCCAAAGTGTTGGGATTACAGGGGTGAGCC 

ACTAAGCCCAGCCATTGTACATATTTTGTGGGTATTTACTAAAACATTAT 

TCAAAATAGTAAAAAAAAATTGAAATAAACTGGGGACTGGTTAAATAATT 

TTGGGTACAACCACATGATGGAATACTATACAGCCATTAAAAATTACATT 

GAGGCCAGGTGTGGTGGCTCATGCTTGTAATCTTAGCACTTTGGGAGGCC 

AAAGTGGGAGGATTGCTTGGACCCAGGAGCTCAAGACCAGCTTGGGCAAT 

GTGGCAAAACCCTGTCTCTAAAAAAAAAATACAAAAAAAATTAAAAAGCT 

GGGTGTGGAGGCACACACCTCTAGTCC CAGC7ACTCAAAGGGCTAAGGTG 

GGAAGA7CAC7TGAACCGGGGAGGTCAAGGC7GCAGTGACCCAAAATCGG 

GTCATTGCACTCCAGCCTGGGCAACAAAGCAAGACCCTGTCTCAA AAAAA 

AAAAAAATACATTGAAGAATATCTTACGGTATGGATAAATATTCATTTTA 

CAGTGATAGATGCAAATAAAAGCAAATTACAAAATATACAGTTTAATTCC 

AACTTTGATACTACATATGTATATATGAATACATGCATATGTTATGTATG 

TATATGTAAATATAACAATATATGTTCTATATATGGATATTATATATTTA 

G^CATACATACACACATATATAATATCTTCTCTAGAGAGCAGAAAGAGAG 

TAGACAGATAATGAAGATAGGATACAACTCCAGTCCAGCTCAACCTAGGG 

GACTTGTTTTAAAGCCTCAGGAGAGAGAAGTTGGGACTAGAAAGCAAGGC 

AGCTATTTGTAAGCATCTTTGTGTTTCATGCTATTGGGGTGGGAAACAAC 

AGCAGAACTTTTGAAAGCCCCTTTCTACTCACCCCAO^AACTGCAGAGCA 

GCTTTAGGACCCTCAGAGTTCAAGAAGACCATTTGCAGAGTAGAAGAAGT 

AAAAACATGTATGAACTTGACCCTGAGCTCATGGACTGTGCCATGAGGGA 

AATTCCTAAAACAGCAGGAGAGGCCCTGGAGGAAGGCAGAGGCCCTGCAT 

ZAGCAAGTCCAGGCAAAAGCCTGCATTCCATAGATGCTCATCTCTCTGGC 

TGGTGAGGTCTAAAGACGTTTGGTCTCAATATTAAGTCTCGTGAGAGAGG 

TCACAAACCCAGTCCCTTGGCCACAAAAGGAAATAAAT TCTGG CTTGAGA 

CATTAGGGAGGAACAGGGCAAGGGGAGGTTCAAGAAAGTTTTAATGGATG 

AGATGATATTTAAGCAAGGCCCTGGAAAATGAGAATTTCAACCAATAGCC 

ATATGGTAGGTCAGAAAGCAAAGATAAGGAGGGGGCAAGTGCAAGGGGCA 

ACATCAGATATGACCAGGGTGTCGTGGGGCATGGCTGATGGAGAAGAAGA 

TTAGACTGGAGTTTGGGAATGCCACAGTATCGAGGTTGGATTTAATCCTA 

TGGGTAATAAAGCCAACTGTTCAACCCCCAACCCACTTGCAATATGGCTC 

CAAAATAGCAGGTGTTTGATAAAATGACTACTTTTACTCTACTATTCCCT 

CCCTCTTAAGAAGAAAAAGAAAGTGGAGGCTCAGAGAAAGGCAGTGGCTT 

3TCCCAATCACACTATGATTTGGCCACAAAACAAGAACGAAATGTTACAC 

CCAAAAATGCTGCCTCCACCTCCCTTC CTlXjCr ' lT CCTCCC TGCTGG ACT 

ACAGACTATCTCAAGAGTGACGTACACCATCAGGGCTTCAGCTTTTCCCC 

GAAACAATGCCAAAATATTAGCCATACGTCACTGTAGTAAGAGCCCTGAA 

rTGGGAATCCCAGCTTTGACGCAGACATGCTGATTGACTCTGTGACCATT 

CTCrrCACTTCTCCACTCTATTCTTCCCCACCTGTAAAGTGAGGTCCTTT 

C CAGTTATAAAAACAGATGATGCTATTGTCCTGTTTTGTATCTAATCTTG 

CTGTGTTATAAAAAAAAAATAAGGCTCTGTACATTCATCTTGGCCAATTC 

CCTTCTTATCTCTACTTCCCACAGCCCCTTTTTCTACAGAAAACCAGCAT 

"GTTCTTCTGGATCCATCTCTTAAGAAAGCGCTTTGCCTCCCCGGTTATT 

^AGGTGATAAGAAGTGTCCTAGATGACAGCCCTGGAATGGGCTGGAGGCA 

ACAAAAAAGCAAGTGAAATAGACAG7TACAGCGACGACAATAATAACAAC 
S^^A-^^^^^GAAATAAAAAAGAAAATTAAAATCTGC

AC^^^^?S^I^ G ^GTCTACAGCACTTGGTT 

2S^I™?^^ATOAGCCTTCATAGCAGCTCACAGC^TCK5A 
;;™--™?^^CTGTCA^ 

ATG liw.ji >aAATAAATGACCCTCTTTTTAGATGAGGAAATCGAGrrTri 
AGGAGAACAAGCAATGTAATGTCCCCCTCCWTTCAGCCATC^C^^T^ 

tagtcggcttgttcccccac^c^^ 

(3AAACTTCTCAGGCTCX^AC3C3GGAATTTTA^^ 

StScacat^ cagcatgtaaatatcattgagccgtagtccagacc 

>Contig31



C,^r i ^^ TCTC ^ GTCAAAGATA TGAGTCTGGAGCAGCACATCC 

taagtcacctcctck^ccaacacagaacttccaggccactcacSgagot 
ctcccaaatagtttccaagtgtcattatgttaataaccStgag^^gaa 
caccagattcaaaccccactgcatggcttttaaagaccatctcaa^g^c^ 



CCCCACTCCCA^CCCOTA^CcaC^iA^CC^^^C^ 

agagacaatccagaactcttgctcactcacagcta^ggSctggg 
ack^tggctgtgtccatoxsaacct^^ 

tggtcacccacctgtctccctggcaga^^ 

tctaccagggctaaccggcctgctcactctccccagStgtS 

cccactctctaattattacattcccttcacataaactccccSt^ 

aatcaccac*tgttcacttcc^ 



ua> ^w»aaaaaacaaacaaacaaacaaacaaacaagcaaaaa 

^S^ TGAT ^ A "^^^ C "^ T TCCAGGAACATGCTTATC^ 
. CTAAC . -TCACAACAACTACAGCAGGTAGGTGTTATCAr A rrr a Trrr-r 



~Z ^XZ^ * ACw 4 ^^ACCACCAAATCCAGTGGCCTCAGGCCTGGCTG 
^ACAC . -wATCACCTGGTGCCCAGACCACATCTTAGACCAGTCATACAG 
ACATATATTTTTTAAGAGTAAATACAAATGGTTAGGAAACATTTTTTAAA

ATGCCCAACCTCATTAAAAATTATAGAAGTGAAAATTAAGCCACAATAAG 

ATACGATTT7ATACCAAATACAGTGTGAACACT7TGCAAGTCTGACCTCA 

CCAAGTC-TTACCAGACGTGTGCACTGACGTGGCTGCTGAGATACTGATGG 

TGGGTC7G7AAATCTGTACTACAAACAATTGCAATAAAATGTAATAAATA 

7ACAA7AGG7GGAGCAGGAAG7GACCTGCAACCA7A7AGCAGA7AGGGCA 

GGAAAAAGCC7A7GAAAGC7GACA7CAAAGGGA7AAG77CCAG77ACCCA 

GCTGAAGGGAAGGAGGGTGTTTCAGATAGAGGAAGGATAAGCATGACCTA 

T7CAAGGCCAG7GAAAGAAGCG7GCAACGGCCAAG7CAGGAGAACC7GAA 

ATTGTGTCAAAGAGCTTGGATGCAAAGAGCCGTGGGAGACTATTGGGGGT 

TTTAAGCAGGGATATAATATTCATTCAAGCATGCAGTAAAAGGTCACTGG 

CACCTGCCATGGGCCAGGACTCGGGCTCTACATGATTGCGTCTGTTTTGG 

AAATATCACCCTGGCTGTGAGATGAAGAACAGGTAGGAGGGTCACAAAAC 

TTGAAGCAGAGAGACTGTTGAGGAAGTAAGCTGTTTTTGTGTGGACTGTG 

GCAATCACAGAGGCAGAGGATATAAATGCACAGAGACACAAGGCATGTGG 

GAGGCAGAAGGAATCAAATACAATGAGTGATCAGATGTGGGGTTAGAGTG 

GTGAGTGAGAAGACATACT»AGGTGACACGCCCAGGTATCTGGGTGGAT 

GGTAAGACATTCATGGACTAGGATCGAGGAANGAGGTGGGGAATGGGACC 

ATACCTGCAGTTTATAAGGGGTGGACGAGGGAAGATTATGCGGGAGACTG 

AGAGAGGAA7AGACAAAGGAA7CCCGGTGCAG7A77ACAGAAAC7GGGG7 

GGGAGGGGGTTGTANTTCAAAAAGGAAAGAAAATTGTCAAATAGTATGAA 

ATGCTGCAGAGAAACTCACGG A 'l " l "r TTTTTTTAAGC7TAGAATTATTCAT 

7GAC7A7G7GAA7AAGAA7AAC77T7A7GAAAGAAG77T7GC77AAG7AG 

-AGGAAGAAGCAAAATTGTTGAGGGCTGATGAGTGGGAGGAGAAGTAATT 

GAAGGCAC7C777CAAGAGAAACAAAGCAGAAGG7GAGGAGAA7AC7AA7 

GAAGGAG7TACGGCCTTCACTATTTTGTTTTGCTTTAGATAAGCAAGACT 

TGAGTGGGTCTGGTGAGGAGAAACAAGTAGAGTACAAAGTTAAAGGAGAG 

ACAGACAGAGATAGAGATAGGGACAGAGAGAGAGACAGAGACAGAGCACA 

ATGTATTGCAATTCAATTCCAGTACTAACCACCCAGAGTTTGTGTAGACT 

CTACAAGTTAAAGAGCATGGTCCCCAACAAGACTGCTTCTACGTCAGATG 

CCAGGCACACTTCAGGGGTCCCCAAGCCACTCATGTTTTTTGAATGACTG 

CCATAAGTTCAAAAATTCCCACAATTCTCTCAGATTCAATAACT GGGTA T 

AACCACTCATAGAACTCAAGAAAATGCTATCATTATTATTACAATTTTAT 

TATAAAGGATACAAATCAGAAGGACTAGCCAAATGAGGAGACACATAGAG 

AGAGGACTAGTAAAAAACAGAGCTTCTGCGTCCTACCTTCAAGGAATCAG 

GATGCACCACCCTCCCAGCACATCAAGTGCTCATCAACCAGGAAGTTCCT 

■-""GAGCTCGAATGTCCAGAGATTTTAGGGAGGATTCATTACATAGGTATC 

ATTGATTAAATCATTGGCCATGTACTTGAACTCAATCTCCAGTGTCCCTC 

-TC7CCCTAGAGG7CTGAAGGGTTGGCTAATATCATGTGGCTCAAAGCCC 

CAACTCTAATTACCTTTTTGGTCTTTTCAGGGACTAGACCCCATCCTGAA 

GCTATCTACAGGCCCTGCCATGAGTTAGCTCATTAACATAACAAAGACAC 

TTATATTACTCAGAAAATTCCAACAGTTTTAGAAGCTCCATGTCAGGAAC 

CTGGGACATAGATCAAATT ClTm ' T ' I I ' ll"! 1 1 1 1 1 1 1 1 G GAGACAGGGT 

CTTGCTGTGTTGCCCAGGCTAGAGTGCAAGSACAGATCACAGCrCAATGC 

AGCTTCAACTTCCCAGGCTTAAGTGACCrrTCGACCTTAACCTTCC AAGT 

ATCTGGGACCACAGAAAATGGCTAATTATCCTGGCTGATTTTTAAACTTT 

'ITTTTTTTGTAGGGATGGGATCGCCCTGTGTTGCCAAGGTTGGTCTCAAA 

CTCCTGGGTTCAAGCAATCATTCTGCCCTGGCCTCTGTGA TGGTTA ATAC 

TGAGTGTCAACTTGATTGGATTGAAGGATACAAAGTATTATTTTTGGGTG 

7G7C7G7GAGGG7GTTGCCAAAGGAGATTACA7TTGAG7CAG7GGACTGG 

GAAAGTCCACCCTTTCCCAGTGGACTGGGAGACCCACCCTCAATCCAGGT 

AAACACAATCTAATCAGCTGCCAGTGTGGTCAGAATAAAAGGAGGCAGAA 

GAACAGGGAAACACTAGACTGGCTTAGTCTTCCAGCCTACATCTTTCTCT 

CATGCTGAATGCTTCCTACCCTCGAACATCAGCCTCCAAGTTCTTCAGTT 

""77GGAC7CT7GGACCTTCAACCACAGATTGAAGAC7GCAGTG77GGC7T 

CCC^GT^TTGAGGTTTTGGGACTCAGACTGGCTTCCTTGCTCCTCAGCT 

"GCAGATGGCCAATTGTGGGACTTTAACTTGTGATCATGTGAGTCAATAT 

•"CC-TAATAAACTCAGATATATATATATGTATCAGACATATATATATATC 

CTATTGTA7ATTATATACAGATATATAATATCCTATTATATACAGATATA 
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-ATCw-.ATTGuTTCTATCCCTCTTGAGAATCCTGACTAATACAGCCT'"' 




AGCTrTTAAAAAGAAAATGAGTCTCTGGTCCCAGGTTTCATCTGAATTCT 

>Contig32 

AA^GC^TACGAATGAGGAAGAATTAAGGGCCAGAACAAAACAAGAAGA 
. vjAG^vjAAAGTTTGGAACTTCTTAGAGACTGGCTAAATGGTTGTGACCAA 
AATGCTGATAGTGATACGGACAATGAAGTCCAGGGTGACAAAGTC-'CAGA 

tggaaatggggaatttgttgggaactgggcaaaggtcacccttgctatga 

CTCAGCAAAGAAATTGGGTGCATTGTGTrrAT(rprrTr:ftrtrti'r/~.^.^/-« 



" " "~ ~ w '""v">«iuni v*/u_ i iAULnj iAWiGXATCTAGTGGAAGAAA 

CCTCTAAGCAACAAAGTGTGTTGCTTAGAAAT^CTTTC'~rTCTTTTTTT 

TTTTTTTTTGAGCTGGAGTTTTGCTGTGTCGCCCAGGCTGGAGCGCAGTG 

GCGCAATCTTGGCTCACTTCAAGCTCTGTCTCCTGGGTTCATGCCATTCT 

CCTGCCTCAGCCTCCCAAGTAGCTGGGACTACAGGCGCCTGCCACCATAC 

CTGGCTAATTTTTTAGTATTTTAGTAGAGACGAGGTTTCACCATG^AGC 

CAAGATGGTCTCAATCTTCTGACCTCGTGATCCACCCGCCTTGGCCT-- " 
AAAATGCTGGGGTTACAAGCA.Tnanrrarrrf'rsor'r^^r^T.^^^w..^ 




_™ w w,vj U n i«.iwv_i_i,*ji,v_ivjv_i 1 1 i AACAGCCTGTGCTCAGGG 

SI^^ T ^ CT7; ^" GG ^ Cc rATGTTTAAAATGGAAGTAGAG7 

, i AAAAATTTGGAAAATTTGCAGCCTGGCCTTGTGGCAGAGAAAGAATC'" 

AAGTAGGCTGCAGAGCAATCATTGCTAGAGAGATTAGCATGACTAAAAGG 

GAGC CAAGTGCTAATATTCAAGACAATG7TAAAAAGGCCTTGAGGGCATT 

rc^GAGATCTATGAAGCAGCCCCTCCCATCACAGGTGCAGAGGTTTGGTG 

CACTAGGCCCAGAGGTTTTATGGGCCANNGCCAGGGCCACACTGCTATGC 

ACAGCTTTGGGACACTGCTGCCCGCATCCAGGCCACTCTGCTCTGGC7CC 

ACCCTTGGCTCAAACGGGCCAAGATAGAGCTTGGACCACTGCTCCCGAGG 

GCACAAGCCATAAGCCTTGGTGGTTTCCATGTGGTGTTAAGCCTGCAGGT 
GCCCAGAATGCAAGATTGA£a3GAt3CTTflrtrtrir«rTrrK^r«f»» » 




™_^»,^ W4nwl A \»\»w»usjw*Ui\AVjl^Tt^TAl-AGGGGC 

AGAGCCriTGCAGAGAACCTCTACTAGGGCAATGCCAAAGGAAAATGTGG 

GGTTGGAGTCCTCACACATGGTCCCCACTGGGGCACTACCTGGTGATACT 

GTGG<3AATGGGGCTGCTGCCCTCCAGACCCCAGAATGGTAGATGCACTGG 

CAGCTGGCACCCTGAGCCTGGAAAAGCTGCAGGCACTCAACTCCAACCCA 
TGAGATCAGCCACATGGGCTa.OTPPranrtr5& iflrrrir»n.«w« 




— — - .m-jwiu^^ i«\.wv.v.i AV3iiKV.LAUC i iUtJivjeACATGGAA 

TCAAAGATTATG7TGCAGCTTTAAGGCTTAATGTTTTCCCTGTCAATTTC 
AGGCTTGTGTGGGACCT G T T GCI 1' TI T 1 1 1 1 1 U 1 T T 1 1"1 r TT T TTT GGT 




AAAArGGTGATATGGTTTGGCTCTATGTCCCCACCCAAATCrCATCTCAA 
ATTATAATTCCCATAATCCCCACATGTTGAGGGGAGGACCTGGTTGGAGG 
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TGATTGGATTATGGAGGCAATTTCCZCCATGCTGTTCTGGTGATACTGAG 

TGAGTTCTZATAAGATCTAATGGTTTTATAAGTGTTTGGAAGTTCCTCCT 

ACACACATGCTCACACTCTCTCCTGCAGCTTTATGAAGAAGGTACTTGCT 

TTCCTTTCTGCCATGATTGTAAGTTTCCTGAGGCTTCCCAGCTATGCAGA 

ACTGTGAGTCAATTAAACCCGTTTTCTTTATACATTACCAGTCTTGGGCA 

3TTCTTTACAGCAGTGTGAGAACTGCTGGCGATGAGAGTGACCTCTGGTT 

GI^CTCACTGCTCATTATATGCTAATTATAATGTATTAGCATGCCAAAAG 

ACACTCCCACCATGACCCCAACAGTCATGCCTGTGCCGGTCTCAGCACCA 

TGACAGTTTACAGATGGCATAGCAACGTCTAAAAGGTACCCCATATGGAC 

TAACAAGGGGAGGAACCCTCAGCTCTGGGAAGTGCCTACCTCGTTCCCAG 

AAAGCTTGTGAATAATCCAC7GCTTGTTTAACATATAATTAAGAAATAAC 

TATTAAGCATCCTTAGTTCAGCAGCCCAAGCTGCTGTTCTGCCTATGGAG 

TAGCCATTCTTTATTCCGTTACTTTCTTAATAAAATTGCTTTTACTTTAC 

TGTATGTACTCGCCTGGAATTCTTTCT7GTACGAGGTCCAGAGCCCTCTC 

TTGGGTCTGGATCGGGACCCCTTTCTGGTAACATTTTGACCAATTTCTCC 

CTTCTGGAATGGGAATGTTTACACAATGACTGTATCACTTTKAATCTTG 

GAAGTAAATAATTTGTTTTTGACTTTACAGCCTCATAGGTGGAAGGAACT 

TGACTTGAATTTCAGATGAGACTTTGGACTTTGGGACTTTTGG GTTGG GG 

CTGGAATGAGTTAAAAGTTGGGGGGATTATTGGGAAGGCACGATTTTATT 

TTGCAATATGAGAAGCACATGAGATTTGGGGGACCAAGGGTGGAATAATA 

TGGTTTGGATGTTTGCCCCCTCCAAATCTCACATTGAAATGTAATCCCCA 

GTGTTGAAGTGAGGCCTGCTGGAAAATGTTTGGATTACAAGGCTGTCGAG 

CACAT7GGATAAGACGTGTAGGNCCC 

>Contig33 

CGCAGCTCGC7GGTTAATTCTGTGGCTCC7GTGACCACTATTATAGCACC 

AGGTCTATGACCAGGAGAATTAGACTGGCATTAAATCAGAATAAGAGATT 

TTGCACCTGCAATAGACCTTATGACACCTAACCAACCCCATTATTTACAA 

TTAAACAGGAACAGAGGGAATACTTTATCCAACTCACACAAGCTGCTTTC 

CTCCCAGATCCATGCTTTTTTGCGTTTATTATTTTTTAGAGATGGGGGCT 

TCACTATGTTGCCCACACTGGACTAAAACTCTGGGCCTCAAGTGATTGTC 

CTGCCTGVGCCTCCTGAATAGCTGGGACTACAGGGGCATGCCATCACACC 

TAGTTCATTTCCTCTATTTAAAATATACATGGCTTAAACTCCAACTGGGA 

ACCCAAAACATTCATTTGCTAAGAGTCTGGTGTTCTACCACCTGAACTAG 

GCTGGCCACAGGAATTATAAAAGCTGAGAAATTCTTTAATAATAGTAACC 

AGGCAACACCATTGAAGGCTCATATGTAAAAATCCATGCCTTCCTTTCTC 

CCAATCTCCATTCCCAAACTTAGCCACTGGCTTCTGGCTGAGGCCTTACG 

CATACCTCCCGGGGCTTGCACACACCTTCTTCTACAGAAGACACACCTTG 

GGCATATCCTACAGAAGACCAGGCTTCTCTCTGGTCCTTGGTAGAGGGCT 

ACTTTACTGTAACAGGGCCAGGGTGGAGAATTCTCTCCTGAAGCTCCATC 

"^CTCTATAGGAAATGTGTTGACAATATTCAGAAGAGTAGGAGGATCAAG 

ACTTCTTTGTGCTCAAATACCACTGTTCTCTTCTCTACCCTGCCCTAACC 

AGGAGCTTGTCACCCCAAACTCTGAGGTGATTTATGCCTTAATCAAGCAA 

ACTTCCCTCTTCAGAAAAGATGGCTCATTTTCCCTCAAAAGTTGCCAGGA 

GCTGCCAAGTATTCTGCCAATTCACCCTGGAGCACAATCAACAAATTCAG 

CCAGAACACAACTACAGCTACTATTAGAACTATTATTATTAATAAATTCC 

T CTCCAAATCTAGCCCCTTGACTTCGGATTTCACGATTTCTCCCTTCCTC 

CTAGAAACTTGATAAGTTTCCCGCGCTTCCCTTTTTCTAAGACTACATGT 

TTGTCATCTTATAAAGCAAAGGGGTGAATAAATGAACCAAA TCAA TAACT 

^CTGGAATATCTGCAAACAACAATAATATCAGCTATGCCATCTTTCACTA 

TTTTAGCCAGTATCGAGTTGAATGAACATAGAAAAATACAAAACTGAATT 

r TTCCCTGTAAATTCCCCGTTTTGACGACGCACTTGTAGCCACGTAGCCA 

CGCCTACTTAAGACAATTACAAAAGGCGAAGAAGACTGACTCAGGCTTAA 

GCTGCCAGCCAGAGAGGGAGTCATTTCATTGGCGTTTGAGTOVGCAAAGG 

TATTGTCCTCACATCTCTGGCTATTAAAGTATTTTCTGTTGTTGTTTTTC 

TCTTTGGCTGTTTTCTCTCACATTGCCTTCTCTAAAGCTACAGCCTCTCC 

TTTCTTTTCTTGTCCCTCCCTGGTTTGGTATGTGACCTAGAATTACAGTC 

AGATTTCAGAAAATGATTCTCTCATTTTGCTGATAAGGACTGATTCGTTT 

T ACTGAGGGACGGCAGAACTAGTT7CCTATGAGGGCATGGGTGAATACAA 

^^GAGGCTTCTCATGGGAGGGAATCTCTACTATCCAAAATTATTAGGAGA 

AAAT^GAAAATTTCCAACTCTGTCTCTCTCTTACCTCTGTGTAAGGCAAA 
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rACCTTATTCTTGTGGTG 1 fTrrTGTAACCTCTTCAAACTTTCATTGATTS 
.^TGCZTGTTCTCG^TACATTAGGTTGGOCACATAAC^TAC CAAcX 
:^^^J7£TA^GAAGTTTACGATCTAATAAAGGAGACAGGTA 
^^S^^S^^ C ^ TOAGCTAGAAG ATCGAGAAAATGC?GAATGT 




: - ; -r~^ . . ^v. i ; j at< j <jtgg;;tggtg 

: — i r : ~ : : ^ a ^S7^^ TGGCCAAAG77ccaga catgtttgaaga 
v. - ^ u«AGAAC - -ttacaggtaaggaataagatttatctcttgtgatttaa 

^I^^^^^^^^ g ^ gaggttgtaaagatg ~ g tactagtcctgaagtc 

«GAGwvGo i. - -AGAGAAGACCCAGAAAAACTaAGCATTCAGCATGTTAAA 

ctgagattacattggcagggagaccgccattttagaaaaatta^tttga 
ggtctgctgagccctacatgaatatcagcatcaacttagacacagcctct 
gttgagatcacatgccctgatataagaatgggttttactggtccattctc 
aggaaaacttgatctc^ttcaggaacaggaaatggctccacagcaagctg 
ggcatgtgaactcacatatgcaggcaaatctcactcagatgtagaagaaa 

GGTAAATGAACACAAAGATAAAATTACGG AACAT ATTa a & rp & irsiv^ t 




rgcctggatattrractaagtataaattatgaaatctgttttagtgaata 
catgaaagtaatgtgtaacatataatctatttggttaaaataaaaaggaa 
GT ^SI:H^ CC ^ CTTTTCTCTAAAGGAGCT7 
ACT.-ftAi . aaagctcttcaatttgttagccaagtccaatttttacagat 
aaagcacaggtaaagctcaaagcctgtcttgatgactactaattccagat 
tagtaagatatgaattactctacctatgtgtatgtgtagaagtccttaaa 
tttcaaagatgacagtaatggccatgtgtatgtgtgtgacccacaactat 

CATGG7CA7TAAAGTACATTGGCCAGAGACCACACTGAAATAACAACAAT 
TACATTCrCATCATCTTATTTTGACAGTGAAAATGAAGAAGACAGTTCCT 




*«w*w»« j. uui l i_Hi LUAAAUAliAACAGCAA 

CCTGTCTTTCATTATAAGTGAGACCAGCTGCCTCTCTAAACTAATAGTTG 
ATGTGCATTGGCTTCTCCCAGAACAGAGCAGAACTATCCCAAATCCCTGA 



' - -'-™->--" * a ^urto^^uvjA>\ucaAUuv.AGTTGAAAGTGAGAA 

rCTACAGCCACTCATCAATCTGTGTTATTGTGTTTGGAGACCACAAATA 
GACACTATAAGTACTGCCTAGTATGTCTTCAGTACTGGCTrrAAAAGCTG 
TC C - -AAAGGAG7AT7TCTAAAATATTTTGAGCATTGTTAAGCAGATTTT 



GGAAGGTTCTGAAGAAGAGACGGTTGAGTTTAAGCCAATCCATCACTGAT 
GATGACCTGGAGGCC ATCGC CAATGACTCAGAGGAAGGTAAGGGGTCAAG 
CACAATAATATCTTTCTTTTACAGTTTTAAGCAAGTAGGGACAGTAGAAT 
TTAGGGGAAAATTAAACGTGGAGTCAGAATAACAAGAAGACAACCAAGCA 
TTAGTCTGGTAACTATACAGAGGAAAATTAATTTTTATCCTTCTCCAGGA 
GGGAGAAATGAGCAGTGGCCTGAATCGAGAATACTTGCTCACAGCCATTA 
TTTCTTAGCCATATTGTAAAGGTCGTGTGACTTTTAGCCTTTCAGGAGAA 
AGCAGTAATAAGACCACTTACGAGCTATG7TCCTCTCATACTAACTATGC 
CTCCTTGGTCATGTTACATAATCTTTTCGTGATTCAGTTTCCTCTACTGT 
AAAATGGAGATAATCAGAATCCCCCACTCATTGGATTGTTGTAAAGATTA 
AGAGTCTCAGGCTTTACAGACTGAGCTAGCTGGGCCCTCCTGACTGTTAT 
AAAGATTAAATGAGTCAACATCCCCTAACTTCTGGACTAGAATAATGTCT 
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•jGTACAAAGTAAGCACCv-AATAAATGTTAGCTATTACTATCATTATTAwr 

A77A7777A777777777777GAGA7GGAG7C7CAC7C7G77GCCCAGGC 

-GGAG7GCAG7GGCGCAA7C77GGC7CAC7GCAAGC7C7GCC7CC7GGG7 

-CACGCCATTTTCCrrGCCTCAGCCTCCCGAGTAGCTGGGACAACAGGCAT 

GTGCCACCATGCCCAGCTAATTTTTTTG7ATTTTTAGTAGAGATGGGGTT 

-CACTGTGTTAGCCAGGATGGTCTCTATTTCCTGATCTCATGATCCGCCT 

GCC773GCC7CCCAAAG7GC7GGGA77ACAGGCG7GAGCCACCGCGCCCG 

G<?C7A77A77A77A77A7TAC7ACTAC7AC7ACC7A7A7GAA7AC7ACCA 

GCAATACTAATTTATTAATGACTGGATTATGTCTAAACCTCACAAGAATC 

C7ACC77C7CA7777ACA7AAAAGGAAACTAAGC7CATTGAGA7AGG7AA 

ACTGCCCAATGGCATACATCTGTAAGTGGGAGAGCCTCAAATCTAATTCA 

GTTCTACCTGAGTAAAAAAATCATGGTTTCTCCTCCATCCCTTTACTGTA 

CAAGCCTCCACATGAACTATAAACCCAATATTCCTGTTTTTAAGATAATA 

CCTAAGCAATAACGCATGTTCACCTAGAAGGTTTTAAAATGTAACACAAT 

ATAAGAAAATAAAAATCACTCATATCGTCAGTGAGAGTTTACTACTGCCA 

GCACTATGG7ATGTTTCCTTAAAATCTTTGCTATACACATACC7ACATGT 

GAACAAATATG7CTAACATCAAGACCACACTATTTACAACTTTATATCCA 

GCTTTT CTGACT7AGCAA7G7A77GATGACA77ATGCA7GC77AGAC C7C 

C 

>Contig34 

G7A77C7A77C7CGG77ATAACACAA7CACAG7GA777G7CA7ATC7x7C 

CAGGA7T7G77AA777CAC77C7TCAGC7G777CCCCC77G77GGC7GGA 

AC7GA7777C7A7C77C7GGGAGAA7C77CAGCAAGCCAAC7CAGGA777 

G77GGG7GCA7777G7CAAG7CTAGGACCCAGGC7C7GGG7GAC7GAT77 

CC7C7AA77ACCGAGCAA7G7AAAA7GAGGAAG7C7GAT7G7G7AAAGG7 

G77AAAC7777G7G7GACGGCAAAACTr7AA7ACCA7GAA7AGAGA7TCC 

AGAAT777CCAAC77C7AACGGGA77CC7T7CACTCCCTGACA7TAGAAT 

G77AGAAAA7C7ACCACAAAACATC7G7GAGGC7A7CC7ACAAGGCCCGT 

7TT7CAAAA7AGGT777TACAAGGA7TGCTATTTGGGATGATAGTTTCAG 

AAAGGCGCTA7CAAAGTTAATTGATGA7G7G7GCAAGCTGAAAGTTATAT 

GTTAGAAC7AGCAGTGATTTCAAAAA7ATCCCT77TAGGC TT7TT GCTAA 

7A7A7C7GC7CA77T7CAAAG7TCCCAATAT7ATAAAAC7TrT7AAAGCA 

GAAAGAAGAACCCTCCATTTCTGCTGGCCCC77CCC7GTTCAACTAAAAA 




7GGGA7C7G77A777C7C7CCA777C7GC7GC7GCA7GG7AG7CCAAG7C 

^ c ~-r~c~~~-CCCC7AGGCCA777GAA7CA7C7GC7AAT7GG7TT7CC 

-GATTGCCACGGAAAC77CC7CCA7CCC77CC7CACA7A7CAGCCACAGA 

AG7A7""""'AAAAAGCAAA7C7GGTGACA7GAAGCCCr7GCACAAAACCC 

ArrCA77AC7GG77CCACACC7CC7rrG7GGATAAG77CAAGC7CC7GAG 

7G7GGCAAGCAGGGCCCACC7GGAA7CCCC7GCCC7CC7C7CC7ATCCCA 

CGCATCAATCTT7CCTG7C7A777GCAGT7CC77GAA7GTGATATTCTT7 

C7AG7CTC7G7GC777TGCATAACCTGT7CTTCCTGAC7GGAAACTCC7T 

C7CC7 r CT7GTAG777GGC7AA777C7AG7C777CAAGAC7CAGC7CA7G 

CTTCACCCCC7CTATAACAAGTCCTTTCCCAAGCTGGG7GGTGGATGCTC 

C7CrG7GC7G7GTGAGTCTTGAACATCC7CAGCAAACC7CAGCT7TGTT7 

GCT7G7C7CCC77GC7GTCAATGCACCTGA7TCAGGGCTGGCATATACTG 

^7CACC7CCA7GACTGGCTCATGGTGG7GCrCCGTGAATATCATCCACCC 

AAACGGA7GAGAGCTACCATGCCATCACT7GTGACTrCCATCTGGAGCTA 

ACC7CCCCCGACAGGAAAGCGTTTCC7TAGGAAAGAATATCTTTGGGTTA 

AATAGAAG7AGAGAC7CACCAGAAGCAC7ATGTCCAGC7CAGAATGAAC7 

GC7CAG7AAGCAGCC77G7CAATGAGGAGGCAGCAGGCCAGCCCCAGAGG 

CC7CAAAG7GGGAGAG7AGAGAAGCGCAGTTCCTGCCACAAAGGCACAGT 

GGACACC77GC7CCCC7GGCTGGCTGGAAGCAGATGGTGTCCACCTGCTT 

CCA7GGGAA77C7GCACCTTTAATAAAGTT7TA7GGGACAGGAAGG7GAC 

7GGCA77GACA77G7AACGAGGAA7GGG7GG7GCCACCT7TGC7G7G7CT 

""ACCAGAAA7ACC7GTGGCAGG7AAATT7C7AGAGAGACCC7CCCA77TC 
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S^I^^ G ^^I ATG ^ G7GCTGtGG ^ GG TGTATAAACCAACTGC 
^gE^H^^CTTGGGGCAAATGCGACCTCTAGGACT 



*x aavj^caggtAGGATTGCATGTGCCTGTAGTC 

rGAGGCAGGAGGATTCCCTGAGACCAGGAATTT • UAUGCTG 

CAGTGAGCTATTAAGTTGGCGCAAAAGTAATCGTGGTT7TTATCATTAAA 
AGTAATGGCAAAACTTTTAATGACAAAAACCGTGATTACTTTTGCACGIA 




^ * w * ^.wiyrmn i./v\AMMMx AHAiAAAAi . GTAAACATGTAAAAAAA 
ACCCCAAAAACAAAAAAAATGGGTGTTGAGACCCCTGAATT5AGGAATAA 
TAGGAAGGAGTGTGATTCTGTGTGTGCATGCATGGGTGTGCACCCTCAGT 




AAGATACCTATGGACTAGAGTCCCTCCTCAGAGGAAAGGCTCCTCCCATT 
TCTCTGGCTTTCAGGTAGTAGTCCATGACTTCAACAGGTCCCCAGTGCAA 
TGTTATGGG7TAGTTTAGGTGGGGTCTCCTCTGAGAGCCTCCCATAGCCC 
AAAAGGCCCTGTCCTAGCTGGCACTGCATCTCCCTCTTCCCAGCTC"CAG 

^^^^^^^^tf^%^^^^^^^Mff^^^^+^%f*p%^^qt PUMA A* _ 

GATG" 




* — "»«wwywvj i/vjwwaAAi l^i I^AGCTUWjGAGTTCCAG 
GCTGTAG7GAACCATGATTGCACCATTGCATTCCAGCCTGTGTGACACAG 
CGAGACCCTGTCTTTTTrCTTTTTTTTTTTGAGACAGGGTCTCGCTCTGT 
CATCC AGGCTAGA GTGCAGCGGTGTTTTTCTGCTCACTGCAGCCTCAACC 
TGCACATTTTTTGTAGAGACGGTGTCTTGCTATGTTGCCCAGAGTGGCCT 
CAAACTCCTGGGCTCAAGAGATCTTTCCACCTCAGCCTTCCAAAGTGCTG 
GGACTACAGGCGTGAGCTACCGCGCCCAACAAAGACCCTGTCTTAAAAAG 
AAAACAAAAATAAACAACTCCCTCAAGT C II I 1 1 1 T ' lTTTTTT GAGACGG 
AGTCTCGCTCTGTCGCCCAGGCTGGAGGGCAGTGGCGCAATCTTGGCTCA 
CTGCAAGCTCTGCCTCCCGGGTTCACGCCATTCTCTTGCCTCAGCCTCCC 
^^^GCTGGGACTACAGGTGCCCGCCACCACGCCTGGCTAATATTrTGT 
Ai . . -TAG7AGAGATGGGG7TTCACTGCGTTAGCCAGGATGG7CTTGATC 
~-C7CACC77G7GA7CCGCCCGCC7CGGCC7CCCAAAG7GC7GGGA77AC 
AGGCA7GAGCCACCGCGCCCAGCCAGACC7C77GAG7C77AAAC7CC7C7 
G7AG77CCAGCCACCC7TTAGCACATGAC7CTGTTAATTTTG77C7CAC7 




CA7GCCAG7G7GGA7GATTaAAA7TG7TGaG7GGaGGC7GA7CAGATGAG 
CCA7C7CC77CCAAGTCCTCACTTGCTGGC7CC7G7C77AG7777AGTCC 
CCA77C77CAAAGAACG7GAGCCCTGGAAAG7A777TAG7CA7T7AGrrC 
AG7GCC777GGA7GGGAGGATCACA7CCCTGGGTCCCG7CC7GCAGAC7G 




— _ — » iwiv»w»A x iuui i LAAv^UAAi UALuTTCCCCA 

ACA7AA7CC7ACTCCACAGGGAC7TAAAGG7G7G7CAGAGA7C7C77GC7 

CA7C777C7GGCCAGGTGCCAACGTCAGTT7ATAGCCAAGGGACAAGAC7 

AG77AGCAGA7CAGGCAGG7CTrAGACCCCAGCG7AAG7GCCAGAC7TC7 

AGC7GCAG77G77CC7GCCCACACTGGGCG77CAGG7GGAGAGAGGGCA7 

GGCAC7ACAC7GAGC 7C7CG GCGAAACCCAGGAC7C7GAAA7C7CGG7G7 

CAGCCACAGGCCAC7CTTrrCAGCAGGACTTCAG7CAG7CC7G7CAC7AG 

GC7G7CGAGCACA7GG7AGGC777ACCCC 

>Contig35 

AAGGAG7G7GC77GCTGA7AGCA7G7G7GANGGGACGAGGAG7AAA7AA7 
T7C7GCC77CAAGAAA77GCAAAC7AG7AA7GGAGA7AAAA7CAACAGAG 
3AACAA77AGAG7A7AAGG7AAAA7C7AAGGGCCA7AAGAGAGGAGAAGA 
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AGTATGGGAGTTCAGAGu rAGGGGGTAAATGAGGGGAGTAGG7GGGTA(jA 

.AAAGGTTAAAAGTAAATAATGATGGGAAGGAAGACAAAAAGACGACAGGG 

GTGCCAAAGGACTCTTAACCTCATCTGAACGGAGTTGCCCTGTTTTGCTC 

T C7G ATG CT CATGT AT CTATCCTTAGAGACAG CTTGGCGGGCAATGTAGA 

GC GTAGGGGCTGACATAGGGGGTTGGAGT C CCACCT CCGTGACTTCTAGC 

AAATTAGCAAACTTTGCTGCTGCTAAGCCTATAAGGCGGACAGAAATGCC 

ATCTTTAAAGCTTGTTATGTAAAGTGCCTAGGACCTCGTAGGCATCAACA 

GGAATAATGGATGAAACAAAACAACGGTGCGTATCTTGGAGAAAGTGGCA 

TCTGAGCAGGAGTATTTTGAAAGGTAGGAAAGGGCTCCAAGCACATCTAA 

GAGATTAGGGAACGCAGAAGCCTTAGCCCTGGGTGCAGATTTAACCAATC 

AACTTC7AACCACCGCAGGCTGAGAGGTGTGGAGTGAGAGCCCCGCCAGA 

GGCAGGAGACCCGGGCTTCGGCCAGACCCCGCCTCCTGGTACAGAGGACC 

ACGCCCGGCTCTGCCTGGAGCCAAATGTGGATCAAAACAGCGCGCAGCTT 

CCCACTGCTGGTGAAAACCCGAGCAAGGGGCCTCAGTTTCTTTATCCGGA 

ACGTGGTGACAATGACATCTCTTTGCAAGGCTGCTGCAGGGCTTTCTGGA 

AATACGCCCGTGAGGTATCTGGGCCTGCGCACAGCCTCCCCCGCCCAGGA 

CCCAGACGTCTACCTGGGGGTCCCGTCTGCGCTCCCGGGATGGAAAACGC 

CCAGGGGAAACTTAGGCAGGCGAGCGGACGGGCACCTCCCGCGGGACGAA 

CTCACTCGGTGGCCTCCTACTTCCCCGGCCGTGTTCCAACGCCTGAGAAT 

AACGGGAACAGCGGTCGTACTCACCGACAGCGGCAGCAGCGGTAGGCCCG 

GGCCCCACCATGACTCTTCAGTGACAGTTTTTCTTCAAACGCCGCSCCTG 

TAGCCAGGACCGGCGTGCCGCGCGTCCACGCGTCCTCATTGGCTCCTGCG 

GGTTTGAAACTCGCTAGTCGTCAGCACGGGAGGGCGGGACAACAGGCAAT 

AGGCTCTTTGCGGTTGGCTCTGGCCTTGAGAACCCGACCTTGGGGCCCTT 

TGATTGGAAGAACGTGCAGCGCACCTCGGCATTGAGGGCGGCTTCCTCGG 

GGCGCGGCGCCGCCCGCCTCTGAGTGCGCCTGTGAGTGCGCCTCCGAGTG 

GGCGTGGGACCCTCCGTGGGGGCCTCAGCCGGGCTGGTGGT7GGGGGGCG 

GTTACGCTGAATCCAGCTGGGGTTGGCGCGCCGGGAGTCCCTGGGCGGAG 

AGACAGGGCGGTCCTCCCAGGATGCTGGGGCCGCTACCTGATTCTGTCCT 

TTCAAAGTCTCAGACTCACAGGAGCTGTGAAAAAATAATATTATAAAGAG 

GACATATGGGTCTTATGCATCTAAAGGCTCCTAGTTCTTAGTACTGCAGG 

GTGGCTCGTTTAATTGTGGTAAAATATGCATAACATCACATATACCATTT 

TAACCATTTTAAAGTGTTAAATTTTTCAAAAATGTGCAGTTTAGTGGTAT 

TAAGTACCCTCACATTGTGGCACAGCCACCACTACTGTCCTTTCCAGAAC 

TTTTTCATCTTCCCAAATGAAACCCTGTACCCGTCACTAACTCCGCACTC 

"TCCCTCCCCCAGCCCCAGGCAATCACCATTCTAGTTTCTGTCTCTATGG 

ATTTGACAACTGTAGGTGCCATATAAGTAGAATCATGCAGTATTTGTTCT 

3TGACTGGCTTGTTTCACTTAGCATAAAGTATTCAAGGTTCATCCATGTG 

"AGCATGTGTCAGAATTTCCTTTCCTTTTAAGGGGGAATAGCATTTCGTT 

3TGTGGAGATGCCACATTTTGCTTCTTGGTCCATCCCTCTCCGGACACTT 

GAGTTGCTTCCACTTTTTGGCTATTGTGAATAATAATATGAACATGAATG 

CACAAATAACTCTTTGAGACTCTCCTTTTCATTCTTTTGGG TATATA CCA 

CGAAGTGGTATTGTTGGATCAAACGGCAATTCTATTTTTAATTT TTTG AG 

AAACTGCCTTACTCCTCTCACGGTGATCTCTTGTTCAAGGTATATTTTCG 

ATTTCACCTGATCAGCTGACTATAAGGCCATftAGGCTAACGGAGAAACGC 

AGGCCTAGTTTCTCCTAGTTACTAGGAGATCGCAGGCCTC GTTGTCCTG A 

ATCCCTAGACAGACTTCATTCCCCTTGTTTTAATCCTAAATTTTTTTTCT 

TTTGAAGTTTGTCCTGTTTCATCTATTCTCCAGTTTCTTAAAGAGGTCTG 

GAAAATGCTTTTGGCTCCTTGTGTATGAAGGTTCCTCTTCCATGGATGCT 

GGAGAAGTCGTGTGTGGAGGGGCAGTCATATCTGGGCACCTGTTGGCCAG 

GTTCAGCTTACCAGTTGGGTACTCAGCAGGGCATGAAGCCACTGCAGCAG 

CCCTTCTCTTTAGCCGTAAATAGGGAGTTTGGAAGAGAGCCAGGGTTTCT 

GGATTTATGCATTITGATATTTTCAATAGTGTATTAAATGTTTAAAATAG 

3AAAACTGATCATTATTTTTGTTAATGACTGAGAAAGGGACTCCTTCACC 

AACAGTTTCAGAAAAGTGAAGGCGGTTTTGTTTTGGTCTTTGTAGAATCT 

AGGTGGTTGAATGCATGTCAGTTGTAGAAGTCACCTTGCCTGATATCCCA 

-GCAGTGCTGGAGTATTCCACAGACCCCATGTAGGTACTGCACCTTTGCA 

GGTATACTGCTGGTGTTGGTGAGCTGCCTTACCTGTCCTGTTATTGGAGA 

-r--^-^CTTATTAGGAAACTTAAAATGAACTCAAATGAGCTTCCT TGCTT 

ACTGGTCGTAGTCCTTTGGAGCAACATAGGCCAGTTCTGCCTCGTTTTTT 
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^^ C - - A 4 C^CrrCTACTCAGGAGGCTGAGGTGGGAGGATC^rT-r 




vjvjTT. .Gv-wATGTTGCCCCAGGTTGGTrrTr,narT>— t^^^, » 



CCACs. «. AGCCTTAGCC CACAC7AC7CTC CC CTTGCTCAGTGTTAT 

CCAGAC ACTTT GTTTTTTCCTTTCCATACTC CTCTCTGTCTGGGAATCCA 

^^^-^^^^^^TACGATTCTCCCACTTACAGTGTAC 
AATT^nAi . . *-7AACATTT7CA7CACCCCC7AAAGAAACC^ATACTCA 

TTAGCAGTCACTCCGCATTCTCCCCTCCTCTCA^cr^AGAAACCATGA 

AAA* .aivxAAi J.TGTGGTCTC7GATGGGCTTCTTTTGTTACCAAAATAT 
CATGGGTTTGATCTAGGTCCTGCTGCTCGCTGCACAGAAAGCCAGCCAC 
GAGATGACAAGTATTGCCAAGGAAGAAGGCTTTAGTCAGGTGCTGCAGC^ 
^AGG/\GA , GGGGGCTCAATCTCAAATCCATCTCGCTGACCTAAAACCAGS 

GGTTTGGATAGCAGGGAAGAAATGTAACAATGCGTAAGAAAACAGGAACC 
AG ^ GGGGCAAGGAAG <^TCCT^^ 

TGCCTGGATGTGGTGATCTGGCGAGTTTCAGTTCTTTGATACTTTTTTTG 
AGAGGCCTGAAGTCTTTTCCCCAGGAAGGAACTCAAACAAAACAA^ACA 
AGCTTCCAGCTTTAAGACCAGAAGCGTCAATTTCTATGTTTATCCGAAAG 
AACAGTCTATGGGACTATTGGTTAAGTTTCACT^CAC^TAGTATGCTGT 

TGGvjiA* . CTATTGTGCGGATATACAATATTTTATTTGCCATTCATCAGT 
; G ^^? A E A I£ TAGG ^^ TC ^^' ZTT TGGCTATTATGAATAATGCTG 

. -AT^rtA' CATGTATAAGTTTTTGTGTAGACATATG7TTTCAACAC'" 

Z A I GGGTATA7ACCTAATGAGATCAAT ^ACTGTG7CATACGATAATT!r'A 
C ^ * 'r^E^ - TGAGGAACTGCCA ^ 

A7T77ACA* . CC7ACCAGCAG7GTA7GAAAGT^CCAGT77CTTTACATC' , 

^ GAACACTTG ^ A ^ G TCCATCT77TAAAmCAACCATCCT^ 

T7G7GAAA7GG7ATCACATTGTGGTTTT7ATTrG7ATTTCCTTGATGAC7 

AA7GA7G77AAGCATCTTTTTATGTG77TAC7GGCCATTTGTATATC7C* r 

A77CAGAG7C7TrGCCAATTTT7AAA7TGGGTCAGT7GTCTrCTTCCTTT 
JT7777GAGATG<3AGCCT^^ 

^ GTC ^ G SI CACTGCAAOTCGA CC7CC7C?rGTTCA^^ 

CC7CAGCC7CCCAAGTAGCTGGGAmCACGCACCTGC»CCAT7CCCAG 
^I AAT P TTTTCTTTG ^ TTOGAG TAGAGACGGGG7TT 




— w.^.i iU i vjw»i inuv. i naKi lAi i i^vITATCTTTC 

-CCC7GC7AG7C7GTAAACTGAGGG7AGGCCAC7ATA7TCA7TG7TC7TG 

GCACCAAATAGAAAC7AAATTAA7G7C7777GAA7GAATAGGGCT7TC7C 

C77TrAAAGA7CCCT7CAATACAG7AACCACAC7ATATATAAG7AGCCAC 

AAGCCCA77CAA7AATACTAC7AG7NC77GCGCCAAACC 
>Contig36 

3GC7CAGCG77AC7A7AC7GG7C7CAAAC7CC7GGGC7CAAGCGA7C-GC 
CCCCC7CGGC77CCCAAAGTGT7GGGA77A7AGGCG7GAGCCACGG7GCC 
7GGCC7CAAA7AACTATTTAAG7GAAACAAAACTAG7ATGGCAC7AA7GA 
AAAA7G7A7AAA7CCA7AA7CGCAGAGGGA777CAAC r ~AC77'"7*~"CGA 
77A7G7AAAGG7CAAACAGACAAAAGACAA7GACAAAACTTAA7GCAA7G 
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AACAC7T773A777AA7GAACA7A7A77GGA7A7G7AC C CAAGAA77AGA 

-.AA7ACA7AC7AG7777GAG777A7GCAGAACA777ACAAAAA777AG7G 

3AAGCC7AAA77A7AAAAAG77GC7G7CACG7AGAA7AACACACAAACCC 

CTGAG7CC3GAATTCAAAGCCCTCCACACTCTCCTCTACCTTTGCATCTT 

TA7CC7CCACCACAC7GCAG7GCA7AC7C7GGGC7AC7AC7CAC7G77C7 

-GATTCAAATTCCATGTTCTGTCAGCTCAAATCA7TCTC7CT3CC7GGAA 

-AAC7AC77CA7ACA7A77CTGC7A77GAA7TC77G7C77AGCACCCCA7 

C7AC7CCAAGACGA7G7CCAG77GGGG77AC7CCC7G7CCCA7777C777 

GAT7ACAC77777TT7TC7AC77CCA77A7AT7A77GA7CACA7C7G7GC 

CACAG77777GAC777G7G7C7GCT777AC7C7777C7AGACCC7GA7AG 

CTCCTGAAGGG77GGGTCAT7TCTTTTTTATTTGC7CATTCCTCA7GGCA 

CAG7GAG7GC77AA7AAA7GGC7A77GACTGAAA77AAAC7G7A7C7AAA 

-GGACATA77CCAC77C7GGGCCA77CA77CT77CT77C7A77GGAACCA 

GGAGA7GGGGAACCA7AACAAAGG7AAGG77GTGCCA7G7GAAAGAACA7 

GGAACC77CCCC7GAGGGCCAAAAAAGAGCAGGGAAAGG7GCAAAGACAA 

AA7C77CCA77777AAACAATG7AAGAA7G7GG7CCACC7CAXGC7CAGG 

-GGGAC7TTA7CA7GACG77A77777GGGGAC77A7AGC7GCA7CA777A 

CCCCA7A7ACA777ACCT7TAG7G7AGGGAAC7GAGGAC AGGAA 7777G7 

-GA7GCAGAC7C77GC7AA7GAGGCTAACAC77GGAGAA77777A7CA7G 

CA77CAAGAAGC77G7Tr7ACA777C77CA77AA7AC777AG77GG7GG7 

-7AGC7TTAG77G7AGGCT7ATCAGA7A777GGAGA7A7C77CA7AAACG 

VrGGCT77GG7777AGAAGAG77A77C7GAAGC7AC7A777C7GGCAA7A 

\7CAAACAGCA7GGCCA777G7777G7AAGGCC777CC7AGAA7A7GACG 

37AAAA7C7ACG7G7GGAAAAATGC77A77C77C7G7CC7C7A7AAA7G7 

3AA7C7AG777G7C77CAAAA7GAAA7CAAG7GA77AAAA7G7AG7777C 

TAAGAAGA7AAA7GGAGCAAAGCAC7CTG7G777CACAG7G77GGAAA7C 

ACTCA7CCG7CA7AAAAC7G7CCCAAC7GA7CC7GAC7CACA7GAA7GAA 

77AAAA7AAGAG7TAA7AACA7CAA777ACA77777AAAGACACTr7CCC 

ATGTr77AGAC7AT7GG77GGAAAAGC7GG7AGG7G7ACAA777G7GGAG 

AGTTGGCTGTrrrTGTCTGTCGTTG7TTGACG7AT77CAAAGCCA7A7C7 

AATTT7GT7GCAGAATGGTCTGAAT7C7ACAAAAA7G77GAGTrG7G7AG 

7GTGGAGAAG7ACGGAGCCA7TTAC7GAAAGGC7GGGGGGAAA7GACGAG 




AGGTGGGGGC7GAGGAAGCAAAG7TGAGGA7AA7TC7GAGAC77C7AGG7 
-GATCCAC7GAAG77ACA77A7TCAACACCACAAGGAAAC7AGGGGAA7G 
AGAAGGCA7AC7GG777GCTT7GGAG7GGAAGGGCAG7GA7G7AAGAGGA 
"-77AA7GAG77AAAG7TTGGA7A7GCC7GAAC77CAA777GA7A7G7GCA 
GA7A~ACCC77GGGG7GACCC7CCAGGCAA7GG77GAACA7G7G7A7 
— CT7AGTAAC7GATAGGCATCACAGAC7CACA7CAG7AAGGAAGCAACA 
GCAAAC77GA77GGACGA7ATACCTGGAAC7CAG7ACCC7A7GAC7GGAG 
CAAG7C7C7G7CAGTGAAA7GAGGATAAGAAGAA7C7TGACCT7G7GGAA 
-A7G77G77AGGAA7A7A7GTGATGAACAACA7AGGATAC77CC7ACAGG 
GC7CCACA7G7AG7AAGGGCTTTATAAA7GC77GA7AAA7A77A77G7TG 
T AA7TTA777CCAAAGTAAGATGCCACTGGAGGAA7CTr7GGAACCCAAA 
T7AA7AACAAATAGGACTGGATGCAATGGCTCACACCTG7AATCCCAGCA 




AGCCTGGGTUAUALAlJV»*aft»«UUi avj inn.iniunnu«»» 

-AACCAGATG7GGTGGTGCACGCCTATAG7CCC7GCTGCT:GAGAGGC7G 
AGGTGGGAGGATTGCTTGAGCCCA7GAGG7TGAGGCrGCAG7GAGCCA7A 
A7TG7GCCACCACAC7CCAGAC7GGG7GACAGAG7GAGACCC7A7C7CAA 
A7AAA7AAA7AAA7AAA7AAATAAA7AAG7ACAAACCAGCAAACAC7AA7 
CCTrTC7AGAGAT7AT7GAACTCTGGAGGGCAGA7C7GAA7GGAGCCAGC 
AGAGGGACC7A7GGAGA7CAGCCTGGCCCTGGACAGCACCAGGCAA7GGG 
G77GC" r AGAGAGG7AA7GGGGT7GAACAGGG77TAAGCCA7GAGG7C7CA 

^GAATCCG7GAAGAC7CAGACTAA'i"n"l"i'1 1 1 1 117 TGCA7GAGGA77AG 
G7G7TCC7AGGAA7TrCAA7GAGAGCAGGGTTAA7GAAGGAA7GCAGGG7 
AGGAGAGC7GAGGGAAGGCA7C7GAGAGAGCC7GGC77A7GAA7GGC7GC 
-7CAG7A7GGC7CACC7GC777CC77G7A7C7AC77AGCAGA7GA7CCCA 
--"CAGGC"-CCAGGGCCAAGG7CA777CCACA7AG7CA7GGGCCC77GA 
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3GGCC7GGAGCAG7G7A*wGAAG^ ^ 
3TCATGGTGCa . GGCAAG7G7CG7CA7CC7A7GCCAAGCC7GA7C7GAAG 
3GG7GCA7GC7CA7AGG7AGC7GC7GCCCAAGA77ACAGCAGC77C7~CA 
ATCCGAGATCCATGCTCTCCTATATTCATTTTTCCAGGGGTTCCTGTCCT 
7CGACAGTGATGAGATGCAGAATGACTTATTGAGTTATTCTCCTGATAG7 
rGCC ^S;^ CC ^ T ^^ TG ^^TGGAGCTTGAGAGTGGAAATG 
AGGCCC7AGGGA7AGCG7GC77AGGAAAACAC7CCCAGCC7GA7G7AA7 m 
ZTuGGGGTACAATGGCATTTTCATCATCAAGACTGATGTAAAGGGTGACT 
AGCAG7GAG77GGGGG7GAC7CGCAC7GGGGC7AGG777C7GA77C* T, GCC 

C CTCA^TTAATGTCCTGGAAAAAra OTTr r n r: a TT^TTrj^T t n % ^/-#~ 




GGCC7GCCC7GC7GATGCC7GCCC7GCCA77CC7GCG7G7GA7G7C7C7G 
GGGCA7C77GCC77CCC7GCCCAGACC7G7AG77CAGC7GAGGGCA7G7G 
GAGGCCAAA7GGC77C77AGAG7G77AC777CC77GAACAGC7C7GC7GG 
GAGAACTGGAGGAGCTAGCTAGTCACGGTAACTGCAGCAGTCAAAGGATC 
GTCCCGGTGGAGGTGGGGTGGAAAGGTAGAGAAAGAGAACATATAGCG7T 
T7CC77GGAGA7G7G7GGGCA7G7CA7AGAGGAAA7ACCCAA77CC7GAG 
CCTTGAGCCCTCCAGGAAACCTTGGAATATTAGGTTAGTCATCCCCAAGG 
AAG7C7AAGAA77C7GG7C7CACCCA7C7CC777AA77CCCACAA7GA7C 
C7ACA7GA7A77AAGGAACACGGGCCAG7AACCCTCCAAGCAA7GGA7G7 
GGTGG7GAAGTTTGACCTCATGATGGAGCGGAGGTTGGTTTGAAACCTAA 
-•AA777AA777A77G777CAAAC7G77C7CCAC7CAGCG77A77AAAGC\ 
TACA7AA77GACACA7AAAAA77G7A7A7G7C7ACGG7G7ACAA7G7GA7 
G777CGA7C7A7G7A7ACA77G7GAAATGA77ACAACAAGC7AAA7AACA 
7ACCCATTCATCG7GTTTCAAAGGAATTAAACTCAAGCACAAAAGAGAGG 
7GC7G . . GAAGAG7 AGGGC TGC7C7A7CTAAG7AG7ATG7C7GGGG77G7 
CC7GGA7CAGGG7CC7777G7GC7AG7AA7AAACCAGCCC77C7GGGGC- , 
GCTCCAC777CCCCACA7777C77C7GGAGCC7CCCTAAGAAT7AGGACA 
7GGCCACr77C7C7GCA7AGGC7TCC7AC77CAACAAGGACAGGGC7TG7 
GCTGCCCCA7GCCACTTGAG7G7CCCTACAGCACAGAGC7GAG7GCACAC 
7GGC7GAG7GAGGAAA7CCCCCAGA77AA7C77GG77C7AAGCA7CA7GG 




TATTGAAATTCCTAGGACTTTTTATTAGTTTTAAAAAATTATACAAGC7T 
AGAGTAAGAAATTAAACAGTGCAAAAGAATTCACTGTGAAAAGTAAAATG 
ZTCTGTCTCTGCTGAGAGACAGATATTGCAGCCCAGATACTACTGGGG7C 
AA7AG777CC777AAGCA7GCCA7777GA7GG777A7GGGAC77ACAGC7 
IAAGAAGC77GACAC7AGGG77GA7C7CAGAAAA7CA77G77GCAGG7A7 
TAGATATGACCGTCTCATAAAGATACACACACAGACACAGCGATTGGAGA 
rATTCACTGGGGCTTATGGGCTGCTTGTCCTTTCTGCTCTGTGCCTAAGT 
TGGGCTCAGAGTAGCCTGGCATCGGCTGTGGGGAGAATGCTGGCATGGGG 
TTAGCAGGAGCCCACTTAACATGTCCTAAGCCACCTGGAAGAGTCCTTCA 
AGGAGACCAGACTCCAGAGGCCCTAAGGAAGGAAGGACTTTTGCCCGTTT 
TTAGGTAT 7CTA GTCCCAGAGTTTAGGGAGGAATGGTTTGGCTTTGGGTC 
3TGTGCCC CTTTACCGAGTGGGATGGGATGTGCCCATGAGCTGTTGAGCT 




TGGTTG 



7AGTGACGGTGGAGGCTGAGGTGGTAGAAAATCAGAGGACAAACCCCATG 
GGCTGC7GG7GATCTGACCGAGC7CC7A7GCTC7CCTGG7TCA7777AGG 




AGCAGCAGG 7AGA C7GGC7GAAGACAGACAGG^ 

GA777G7G77777AAGGAC7777AAC7GGGGAGCCC7CCGGGACAGA7CA 

GA7GAGAG7GAAA7G7GCTCCGCCT7AGCC 

>Contig37 

GGCCG77CGCAA77C7G7AAAAGGGAGAG7GG7777A777A77777AAAC 
A7AG7CAAGC7GC7AAAG7A7A7GA7A7G7A7AGA7AGAG7A7AA77AAA 
7AC777CAAC7ACAGACAAAA7CAGGAGAA7GGAA77AAAAAACAA777A 
CAAA7GGG7AA7GGCAGCA77GGG77GCGCCCACCCACGAGAAGGCAGAC 



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^CCAAGATTCTAAGATC^NJACGTGGCCAGCACTTCAGACTTCAAATAGAri- 

rtcgtgattatgcattatttttctcggaaagttttcacttcactatatgc 
"acttgacacttgctttcctaagacatccctctatttttgagatgactaa 
ztcagcaattcatttctctcacgcataagctgtcactcaacccaaaccca 
ccaagcctgcattctaccctcaataaggtct7ggtgtgtaaactgaccca 
cttcacctagttcc7tagccctctcttgaccagacatgactctttcataa 
3 ctagac ctataaagtcagggct cttaagtagctgatctctgatagtgcc 
aXStgtcccccactgttcacattttccactccagcttctaacaggtgata 
gactgctttttgggggtaggggcaccaaaacatatagacctcatgtttgg 

AT GT AG AC ACT 

CCAG777CT77AAA77ACAAC7ACA7A7TAA7AA7GAC7 
TCCAAGTGTACATTTCAGTCCAGATCTCTCCCTGGATCCCCAAACTTTG7 
.^AAACCCACCGCCTAGTTGATATCTTTTGATGTCTGACAGGCATTTCAAA 
TTTAATACTGTCACAAACAAAGTTATTGATTTTCATCTCTGCATCTGTTA 
CAAA7777TC77ACTTTGGTAAA7AGCACCCCAGGCTGTG7CAC7GCCAA 
GAACTTTCCACAGCTCTTGGAATAAAATTCAAAATATTTTCCAAGGCAGA 
AAGGCACAGTGTAATCTGGCTCCTGCCTACCTCTCCAACCTCG7ATCACA 
CTAG7CTCCCTG7CACTCACCCCCTCCAGGAGCTCAGG TATC CTTAAAGT 
'"' r C7777C777' rr 777 n T77 T ' n 'T T TTTT7GAAACAGT777GC7C7G77 
GCCCAGGC7GGAG7GAAG7GGCA7GA7CTCAGG7CACTGCAACCTCCGCC 
-CC7GGG77CAAG7GA77CTTG7GCC7CAGCC7CCCAAG7AGC7GCAA77 
^CAGGCGCG7GCCACCACACCCGGC7AA7777TG7A77TT7AG7AGAGA7 
GGGG777CACAA7G77GGC7AAACCGG7C7CAAAC7CC7GACC7CAAG7G 
VTC7GACCAC7TCAGCC7CCCAAGG7GC7GGGA77ACAGGCG7GAACCA7 
-G7ACCC7GCC7CC7TGAAG7TrC77GA7CCAGAC7CA77CC7GCC77AA 
GG7C77GCA7CTTGAG7CC7CCCC7CAAA7GACACCTCCA7GAAGACGCA 
A77ACCTG7AA77ACCG7G7C77A777AG7CAA7G7G77GG77T7CrG7C 
TCC7CCAC7ACAG7G7AAGCTC7A7GAAGGCAGAAACCrrGGCAG7CCAG 
77CCCAGCACAG7GCC7AGCACACA7AGG7A777AA7AACACACAG7AAA 
ATTCACC7T7TAG7G7GCAATrC7GAG7TT7GACAAATGCA7CAAG7CA7 
77AAG7C7GAC7A77A7CAAGCrA7AAGA7GG77GCAACAC7A7CAC7AA 
77CCC7CA7GC7CC7TGG7AG7CAG7C7CACCCC7AACGCCCCCC7CC7G 
GCAA7CACTGA7CCG777777G7CTTTATAG7T77GG7T77TCCAGAA7G 
CCAA7AAC7AAG77TTGAA7GAA7GAA7GC7ATTAACTC7CA7T7CTGAC 
TCCAGAGCAACA7CCA7GCAA7A777A77A777CAGCCCCAAA7AC7GCC 
CCC7CACC77CAC7CCAACCACC7AC77GA7GA7ACAAGG7GAGACA77T 
3GCATGTGCTTCCTCCA7G77CCTAGCAT77TCCCTA7CTCCTTAGCC77 
CC77C7AA7CA7AAACCJAAGAGTGAAC777CCC7T7C7AAAGGCAAC7TA 
— CC7AGGACC7CGA7GCCA7AA7777G777C7C7AG7AC77TCTA7A7A 
-ACACCAAACAA77AGC7CCAGAAAGG7AAAGAC7CAC7G7G7GC7CA7C 
^C-GTG7CTCC7AGCGCC7GGCACAC7GCAGG7GC7GAAGAAACACC7AC 
AGAA7GAG7GAA7GAA7C7C7CCC7C7C7AGAC7CC7TCTC7777G7AA7 
CAAACA7GT7CAACC7GCAACACAG7C77ATGACCAA7CC7C7G77G7CT 
GACC7AGGC7GAGCTCCAG«3CTGGGACCC7GACTTCCTTA77CACCACC 
TCAAGG7CTrrGCACTCACTTC7CTrTC7GCTCAGGATTGTTTTTC7TCT 
TG7CACCAG7C7TTTC7CAGAC7TAGG7C7CAGC7CAGACAT7GC7G77G 
AAAG7AC77CTAC7GATCCTTTTA7C7AAAGiCAGCCATTCCAGCCCTACT 
C7C77GA7CATAGCACCC7GAA77AAG7TGT77AC7TAC7GTC7C77CAG 
GAGGGCAAGGAGC77GG7GG7GG7G7TCAGGGC7G7ACCAAGC7G7ACC7 
""GC"TCACCG7GC7ACAC7Tr77AGCAACCA7CrAA7TrrACA7GC7CCC 
' r 7CAC7CG7CAGAAATTTCCTTA77TrC7ACrrCAAGCAGG7A7ACA7A7 
G7GC77C7CG7GGGAGGC7CACCCAC77CA7GAGAC7ACA7TrGG7CC7G 
GG7AGAAAG7G7ACAAAA7CCAC7GGC7CAG77T7AATCAA7G7A7G77A 
ATA7TAACCAACCTGAGA7CTrGA777CCACGCC7GGC7AA777TG7A77 



~C^G7A777GAAC7GAC7GC7CCTGCrTGAQ^i;AUX^ju> ^M^' ; . i. *w 
^"CAC"CAGAC7CACGGAAG7TrC7GGTrC7TCCC7GGTAAC7TTTCTGA 
AC-TAACCAC7GG777GC77GACAAGAGA77ACCA7C7TC7CAC7TCC7A 
3C7A7G7GAAC7CAC77A7C7GC7C7A77GC7G77CAG7C7AGCACGGCA 
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777G7. -CAATTTCTTAAAAAGAAAAAAAAAAAGCTATTGTAAACATACG 

G7C. .AATAx i. . G77A7AT7TGC77CAAA7C777777CAGAC7GTAG77A 
^^i« A £x7^5^^^^ A ^^ A ^^^^^^^^^GACCTAGTCTTC 

^ Aa ^2^IP gc ctaatcatct:aagttgcaaaagcttagaattaa 

- iu1 - juuic - watttacggttgtgacatcagc^tgagttttgggagc^ , 
g7c77g77cagaaaa7gg7tc7ggggaacagcc77777caac7~ggag7c 

ZAAAGTCTGTGCTTTTTGCTGAAAGCCATTATTGTTATGTTTATTACCAC 
7GG77CCA777GG7C77A7GC7AGGGG7GC77GGAA7GGC7GAA77AAA7 
C7GCCAAC7G7CAAA77AGGCCTC7GGCX7ACGGC7777GAC7777GCAG 
TACACATGATGTCTGAGGTATACAAACTTGGCTGGACTTCTGATCTTGC7 
TGATGTTTGGATGTCTGTTGTTATATTCACCCTGAAGCAAACTGGGGTAT 
G77C7GGG777GG7G7GC77CAC7C7C7G77CAG7AACAGGG7A7GACCG 
7A7C77AG777CA777GG7CT7TCA7A7TGAC7CC7A77AACC777A7A7 
CTTTGATG7TCTTGACTACTGGTTTCTTTGAXGACTGAACTTTACTAAGG 
GTCCGAATAAAGTGAGAGGGAACCGTCCTTGAGGGTTTTACTCCTGGTCT 
7GCAAGA7C7GC7CC7C7AGAGAG77GC7G7GA7777AC7GGGAAAG7CC 
7GC777G7G777C7CCAACAAATTG777A77AACCC7A7C777CAGAACA 
GCACTATTAACTGAACT7TTGCCCAAGGCTTG7T7AGGAACTAAACTGTT 
C77GG777GA77A7AAGAG7CAG7C777GGC77AC77C7GG7A7A7AAX~ 
7AGGA7C7GGC7TCC7C7CAGGT7C7G77AAGA7A7C7AGCAAG77C7C7 
77G777G777C77TTAGAAAGTTA7CCAAAGA77CG777TCAACA7GGA7 
A77A77CA7AAAG7C7A7ACATT7ACCAT77CC77GA7C7GT7AAC7GC7 
GC777G7AG7777CAAT7GCTCTA7A77AAG7GACCCCACAGGT777C77 
GACAG7C77CC7GTGGTGGACTATC7AGCT7CACAC7G7TGAAAAC7CT* 
GCTGAAAAGC7TAGACTATGGGTTAGAAGAAACACA7T7TGAAGTCCGCC 
7TTTGCCCAGAAGT77TGGTGGCTC7AAC77CAGC77C7GGGACCCTGCA 
GTA7TAGG7GG7CTGGGCTGGAGTT7AATGC7GA7GGACCTTTTAGGTTT 
GACAGGCAAAACAACATGGTTGGTAACA7CA77TT7GGGTCTAATAGTC7 
GAAA AAACAAAGAAAATACATATTAAAAAA7CC77AACATATCTTATTG7 
riTrAAAA7AA7AACTGTGTTTAACACATGC7AAAAAAAAAATCATrTT7 
AGAA777CA7C7AAGAAAGTTGAATCC7CAGAAAG7AAAGAAAGACTCAC 
7AA7AGG7AG777TTGTGTTTTTTT7T7T77Trr777T7GAGACAGGATC 
77GC7C7G7CACCCAGTCTGGTG7GCAG7GA7GCAA7C77GGC7CAT7GC 
AACC7C7GCC7CCTGGGT7GAAGCAAT7C7CCCACCCCAACC7CGCAAG7 
GGC7GGAC7ACAGGCGCAXGTCACTACACC7GGC7AC77777TGTAT777 
7AG7AAAG77GGGG7TTCACCATATTGGCCAGG77GG7CTTGAAATCCTG 
ACC7CCAG7GA7CCACGCACC7TGGCC7CCCAAAG7GC7GGGAXAACAGG 
7A7GAGC CAC CACACCTG7CCTAACAGGTAG77777ACAAC7TGAG77CC 
7A7CAGAAG7A7A77AGAATCTT7TAGCT7GACAGAAT7AAGCAGAGATG 
CAG7GAA7A7ACAAAAC77GC7C777CAAAAA7GAA777GCC7CAAACAG 
7AG77G77GAA7GCCTATTATATCC7AAG7GCCC7CCAAAGAACCCTGAA 
AAAA7ACA7ACA7AATGAACTTATG7TAGGGIACC7CCCAACAAAXCTC7 
CC7AG7AC7T7GTATAGCCACACTA7ATGTT7777AAACCACTGCC7TTG 
TAAACA7CACAG7A7CAC7CAAGAACC7C7G7C7CA7CCCTGGAGA7CAG 
7GACAAGGAGA7AGGTGGCAGATGA7G7GAGGCC7GAGATATGCTGCCAC 
AGC7C7CAA7AAACATGTAACATCT7AA7AG7CATA77TGTAAAATCAGC 
CAGGACAGGG7T7TAAGGTTAGAGTCTAXG77AA7AA7AAACAAATGT77 
AG7CA7G7GA7TTAAGT7TGGATAAGAAAGG7AGGAC7CGA7TACAGAGA 
A7777GAAAACIAGGGAAGGGAG777AGAA77CA7A7GGTAAGTAA7TGG 
GCAAGCCAC7A7GAA77CC7GAGCA7C7C7CA7GAAAGCAA77AC7CAGA 
AAGGAGAA777CACAGAGATTTATGGAA7A7G777CCAGGGTAAGA7AXG 
GGAA7GC7AGAG77ACCAC7C7A77777GA777GACAAA7A77G7GAAGA 
A7CAC7ACA7AAAC7TGGCGAG7A7G7AAAGGA777C7AACCAGAACCA7 
77GGCA77GAGGGCAAAGAAA7G7C7AC7C7GGA7GATAGCGG7G7G7G7 
GG7G77AC7AGGAGTGAAACAGCGGAG77GGGAG7GGGAGGCAGAGAGA7 
GGA7GG7A7ACCCACAA7GGC7A7A7C7GGA77AA7C777GAGCACCAAC 
A777A7A7ACAC C7 CGGA7C7C7CCA7CA77GC77AC7GAAGAGG7GGAG 

FIG. 3 (35 of 52) 






PCT/US98/16102 

WO 99/06426 

GGACGTrGGCATGAAAG^iTCCAAATGTGTrrTTrrAGTTGCTrrC^Ai; 
VTATTAAAAACGAATTGATATAATCCACAAACCATAAAATTCACCATTTT 
nGTAAGTGCACACTTCTGTGGATTTTAGTATAGCCACACTATTATACAGC 
AATCACCACTGTCTAAT1CCAGAACATATTCATCACCCCTAGAAAGAGAC 
TTGGGTTTACTTGTTGGCAGTCCCTCGCCA 

-■GTC^ACATGTGCTCGCAAGATTGGATATTGAAATATCAGCAAGAAATTA 
AA*GACATAGTAGTCATTATGCCTAAATTATTG7TATTTTTTGATTGAAA 
^AAGTTGAATATTTCAAATATCAAGGTAGTAGTGAGATATAATAAAGAGA 
GAG7CAGTTCTAAGTATAGAATTGCTGATTCAGTTAAGCTCTGTTCTCCA 
ACAT^TGGGCCACATTGAAGAGACCATGTAGCTGCTTTCAGCCTCGGTTT 
-^^^■^GCAAAATGGGGATTACACTACCTGCCTCACAGAGATGTAAAC 
^ATGACATGTTATCATGATTGCCAGGGCCCACCTGTTTTCTTTTAAACA 
"^"GAAATCACTGTGCCTGAAACAGGGATTTCCCTGCCCrTTGTGCAAGCT 
CCAGAAACAGGAGTCAGCCTGAGTCCCGCAGCTAAGAACGTGGATTCTGG 
^^%T"TTrTrR.TAGCSAACACACTTCACAGGTCCTTCAAGGGAGTACATT 




CTGTGGv.CAAAiUW- innnl uuiiui.ni w»ww - * 

AATAATAGTACAGTCATTTTATGTTTCAACTGAACCAAGTCAGGGTTCCA 

-^^^-.^CCTCCCCTTTCTGCTCTGAGGACATCCATGAAGTGGAGGGGG * C 

'ATGTAGCCrGGAGCTATTGGTGAGGGGCGATGGGTCCGTGGTGGTCTTG 

GGGAACTGCGGGGCTGTGTCTGGCTGGTCTGGTGTCTGG7GATTGGCCTT 

—--CACGCGGTrCACGCTGCAGGACAGTrCGTGTCCTTCTTGTCCTAAT 

GATCAGCTTTTAGGCTCACGGGCCTGTCTCTGCTGAGATATGGAATAGGA 

CAGCTCTGGATCTTCTTTAAACTCTCCTGGGGCCACAGGGGACTCTGTT 

^GTGTCTGTGCCCACATAGGATGATrCTGCCCAGACCTTrGCTGCCATTT 

^^TGCTGTTCTGCTGTTTTTAGTCTCTGGAGGGCTTGCAGTTTCCTTGGvj 

GTCCCTGTGGAAGCAAAGCAAAGTCCTCTCCACGCTCAGATGTCTAAACG 

TATCTGGGTTTTATCGTCCACCCATCCCAGAGCTCAGTCTAGAGGAGGGG 

G^GCCTTCGGGTTCTCTCCTTCCTCCCAGAGCCTCTTCCTTTGCACCAG 

GGCAGCCTCTTCCTATCTGTTGGAAAGGGCTGTCTGGTTCTTGAATATAG 

. -^-»rw»^r;r!ft'rr'rftr^irTr:ar,GTAAGGCAAACTATCACATGG 



GGCATATCAGGGGTGAGGGGGCGTCCTGGCTACACCCACTAACTACTGTi 
GCTGAAGAAAGGCCTGGTGAG^TCACTGGGGAATGGTGGGGGATGAAGAA 
CA^CAGATGGATATTGAGGATAAGGGGATCTTGATAAACTGGCTTAG 
^AGGGTTTTTGCTAAAACTGGTTTTCATAGGTAAGTCCACAGACAGGiw. 
TG^jAGAAAGTTCAGGGACCTACGGTTTGTTCGGGCAGATGCTTTGTCATC 
GTCACAC^GGCACTGTa^CCTGGCTTTCCTTTAGTCCCTCCCCCCCTTT 
TTTTTTTCTGGAGTAGTTTTGGGAGACCAGAGGAGCAGGGAGTTAGGGAG 
AGTAGTCAGAAAAGGC CAGAGAAAATAAGGAGGTGTCTGTAGGGAAAATC 
C^AAATCC^TAATTAAATrAATTTAATTTATTTATCTGGGACAAGGTC 

AGCCTCGACCTCAGGGCTCAAGCAGTTrTGCCACCTCAGCCTCCTCAGTA 
GCTGGGGCTCACAGGTG TGCACTACCAT GCCCGGGTAATTTTTG^GTTTT 

rrr l i r ri ' i niii ' Li m 1 1 1 1 1 1 i gtagagatgagg7ttcgccatg 

•^TGCCCAGGCTTGGTCTCGAACTCCTAAGTGATCCATCCACGTCGACC 
;^^«T«rTRaGaTTACAGGCATC3AGC»CTGTGCCCGGCCTAAATTCT 



^GAACC"TAAGTTGGAAACATCTCTGAAGATUi i i^iu«awu 
^ ;: ^GAAGT7GAGTCrTTCATCACTAGGTAGGCGTGTTTTGGAGT 
'"^ATCAAACAGATCCTGTGTTTATTAGGAAGCTGTGGTTCATAAAGCw 
Z ^TGCTAATT773CAGGTAGCAGGG7GGCC CTGGCCTGACCCGGGGACA 
GAG7GGC7G7CCTCCC1 -CAGGCAGGAAACTCT-CTCCTGCCACCTAGTvi W
C " I " G ? C ^^ AT ' I ^ c:AAGGGA G^Trr CTGGGTGGTGAGT^TT AC CAC3ACT 

A7GG7C7GAGG7AGAG77AAGCAAAACAAAAC7AAAC7GCA7AAAGAAAC 
AGAAAGAAAA7CAGG7G77A7AAAAACAA7T7GGCA777G777G7G7 mw '^ 
AGC7CCG7G7CGA777ATTGCT7CCACAAA7AG7GCCGA7A7GCACCAGG 
CAC7G77G7AAAAC7GAAAA7A7G7TTT7GGA7G7GCCCAG7C7G7GAG7 
ATTAAACGATGGTTGATTTGAAATTTGCTATGATTCATATTTC7GGGGGT 
AAGA7GCAGGA77TC777GGGGGGCC7ACGATG7GGCATTC7AGAAT7C7 
7AAAGAA7CAACCC7GG7GGGACCAGGAAGAGC7GAGC7GAGGCC7CTC7 
GC7CA7G7G7AC7TACTGGAGA7CA7GGAGACAGG7GAGCC7GAG7GCAC 
G7C7CACCAAAGCCACAGCAGAGGGGGAGGAGGCGGAAAGAGAGC7C7— 
CCA77TC7GAGAAG77AATGG7AACAA7GGCATACATACC7AC777ACAG 
77GAAA77GGAAACCACAGCAT7AAGTGTTTCCAATGAAA7TTGGCAAT7 
7GGGAG7777C7GAGC7GCATTGGA7G7GG7T7TGCATGCTG7TAGGA7G 
AGCAAGAGA7GATGGAGAACATCTTCCTTTTGAGC7TCCTCT7GGACG7G 
GG7CAC7CCCAC7CA7GGAA7TAGAAAGC77AGACC7AGAC77GAA7C7C 
ACC77C7CAAGG7GC7CCCGGGCAAA7CACTTAAGA7CCA7C , rTCT7CTC 
C7CC7GC7CC77C7CC7CCTTC7GAG T T I I 1 IT I I T ' I TCTTTCCAAAATTC 
AAATGACACGG7ACTGG7AGAAGAAAAGGTCCAAG7C7GC7TT7ACAGC7 
CCCC7CA7CCCCAAA7GTAC7CCGACCCCAAGA7GACCATGT7A7CA777 
0A77GACA7CC77C7AG7TTCAAC7CA777C777GCA7GTA7A7GCACG7 
ACA7A7ACAC7A7777A7TT7GCCAGGGG7CACCG777AGC7GCAT7AA7 
77C77A7AAAA7AA7C7ATA7T7AC77A7GG777ACG7AAAACAACA7AC 
ACA7G7AAG7G7A7AGC77GATAAG7C77CAC7GTAAACCAAAAA7AAAA 
77CGAAGCCCCCCCAACCG7CTGAATGGACCCC7CTTCT7GGCCAAGAGC 
A7TCCAAAG77AACCTGAAAAAACTAGTTCAGGTCATGA7GGAAGGGAAG 
GTTGGACA7GCCCCAGTATACCC7TC7CCCTTTTGGAA77CAGGAAAAGC 
7GACCAGCA77AACA7CAACACAGACC7TA7G7C7GATAGGAAACTT7GA 
CAA7C7A77CCC7CTGAAGC7TGC7ACCCGGAGGCTTCA7C7ACAAGATA 
AAACCTTGGTC7CCACAACCGC7TATCATAACCCAGACA7TCC7TTC7G7 
7GAGAA7AATT7ACCTTGTAACCTGGAAGCTCCCTGCTTCAAG7TCCCTC 
ACCTTTCCAGAT7GAACCAATGTAAACCTTACA7GCATTGA7TGATG7A7 
TATG7C7CCC7AAGATGAATAAAAGCAAGCTG7ATGTTGAC7GCC7TCAG 
CACAGG77G7CAGGACC7CCTGAGGCTGGGTCACGGATGCA7CCTTAACC 




'AT77CCA7CAC7CC7CATCTACCCCCAAA7T7CCTTATGCG7C7T7GCA 
J7CAACC7CCCACCCCA7CCCCAGGCAAC7GCAGA7C7AC77777GTC7C 
7GCAC 777CAAC7GACCC77TC7G7GA777CATA7GAA7GGAA7CA7GCG 
C7GAGCAG7C77TTG7G7C7GGC7TCTTTTGC7CAGCA7AATGTTT77GA 
GG777G7CCATGTTTTTGTGTTTGTCAATGGTTAA7TTCTCTCCATTGCA 
GAG7AG7777C7ATTGTACATGTGTACCACAATTTGTA7A7CCA7TCCAT 
7GCTGA7GGACATTTGATTTGTTTCCAGAT7T7GGCAATTATGAATAGAG 
CTACCA7GAACACCCAGGTACAAGTCTTTGTGTGGACTTATG7TTTCATT 
7CTC77GGAA7GGAACTGTCATATCAATAAGTATATGTTTAACTTTGTAA 
GAAAC7GACAACAAATTATCTGCGATGGTTSTGCGITTTTGTTTTTCTAC 
CAGCAA7ACACGAGCA77TCAG7TGC7CCACAAC7TTGC CAAAACT7GTT 
7TC777AA7TTGGACATTTAAGTGGTGTACAGAGGCATCTCATTGTGGTT 
C7AG7777C77TGCCCTGATGACCAA7GGTGTTGAACATCTTTTCATGTG 
CTTT77GACCATTTACATATCCTCTTTTGTGAAGTGTCTGTTCAAATATT 
TTTGCCCATITAAAAaiTTTGGGGGTTTGTCTTATTATTGTGTTGGGAGA 
G7TCCA7A7T7ATTTA77TA7TGAGA7GGAGTCTCACTCTGTTGCCCAGG 
C7AGAG7GCAG7GGCG7GATCT7GGC7CACTGCAACC7CCACT7CC7GGG 
77CAAGCAATTCTCCTGCCTTAGCCTCCTGAGTAGCTGGGATTACAGGCA 
7GTGCCACCACAC7GGCTAAGTTTTTGTATTTTTAGTAGAGATGGGGTTT 




3GCCCA7A777ATTTTTTATTCTTTA7TT7GTA7ACAAG7TCT7GG7CAG 
ATACAA7AATACC7GGTCAGATGAGATAATGAG7TGGAAAATGC77TGCA 
AA7GGGGGAGAA7AATTTAAA7G7TAT7TATTTAT7AAGAGCAGAGGCCC 
--CCTG77GCGG7CAC.J^GCCG777GC77C77C7GCC7777ATAAAw..

AGCAGAG7CGAGC7ACACAGGC7G7C7G7G77GGC7GC7A77AG77AA7C 

AGAGAG7777777777C77GCC77G7CA77C7AA777G7GACACA7AA77 

AGCCACAA7A7G7GTTT7CAG77G7GACAC7GGCC7GGGA AACCAA GGGA 

-3777AGAG7GGATr7CCT7GATrrrGCAA7AA77G7G7G77777C7GCA 

-C77 C7GT7AAACACAAA77CA7GGAAGCAAAACA7GGAAGCAAAG7ACC 

"7GGACA7CCCCCC77C777A7GAAA77GA777C7C77AAA7G7AA7G77 

TGC7T377CCC77AC777AAAAGCAA777AAGAG777A77GAGAAAG7GA 

GCCCTGGAAACA7AGA7GCA7AGAGAGAAAA77C7ACCACCC7CAGG7CC 

ZTA77G7CT7C7C7CA7AAAG7G7AG777CAGGGCC7777AGAAG777C7 

7T7G7GC7C7GA7TrGCA7G777G7GAG7G77GC7A7777AAG7A777GG 

VT77GG7C7GCAAA7CC7A7GAGAGA7GGCAACAGAG7AGGGA7C7CAAA 

GCC7GCAGG77G7A77AAG7CCAGCAGGGCC77G7A777ACAACAGAGGG 

TCC77GAAGACA77CCA7A7A77A7GC7AGGGGAG7GGCCAAGCAAAC77 

TAA7G7G7CCC7A7GG7GGGA7A777GGGG77AA7ACC7GCCC7TC7C7T 

AA777G77777C7777C7777777 C 77777C777C77777777777GAAA 

~G7AG7C77GC777G7CACCCANGC7GGA77GGAG7GCAG7GG7A7GA7C 

TCAGC7CAC7GCAACC7CCACC7CC7GGG77CAAGCAA77C7CC7GCC7C 

AGCCTCCCAAG7AGC7GGGAC7A7AGGCACACACCACCA7GCC7GGC7AG 

T7777777777777777GAAACNGAA7C7CGC7C7G7CGCCCAGGCGGGA 

C7GCGGAC7GCAG7GGCGCAA7C7CGG 

>Contig39

-GC7CGCA7CCC7CA7A7CCA7GAG7G77C7G7GGGCCC7GCC7C7GAAA 

'AAA7CC7GCC777G7C7CCCAG77CAC7CCAGCCACCCATCC7GGGGC7 

GCACCC7CC7CC77CCAAGCCG7C7CCC777CC77CC7GG7GC7GCC7G7 

CA7G7CAAGCA7A7GCA7CAG7GCGACCAGGACA777GAAA7GCAACCAG 

TACAA77GGGCGCGG77A7GCC7ACCAG77777C77CC77AAACA7777A 

7A777A7G777GAAAGCA7GCCACC777C77CAC77GCCAAC7TGACAGA 

777A77AG77GACAACA7CCGC7GA7AGCA7CAG7AA7AAG77AA77G77 

777GCACA7G7AGCTTTAATrA77C7CA77A7CA777A7AGGAG77A77C 

Tr7G7AAAGGG7AAC7GAG7T7TCCAAAACAAACAGAAA7TTGGGG7GGG 

CCCA7GGAGCG7GAC7CA7GAAA7CAGA77C77AGAAGGACC7CGGCAAG 

TC7C7GGG77GC7GT7AA7GAGCC7GGC7GGC7GCCAGGGG7G7G7C7GC 

CC777A7GAGGCCACCAC7G77CAAA7GC77GCC7GCAGCAT7ACTTGCC 

7AGG7AG7GC77G777C7AC7GAAC7G7CAGGGA7CCAA77C7TrG7GG7 

C7AAG7AACAA7AC7CAGA77CACAAGGAA77GA77AA7AAGCCAGAA7G 

"CAA7GTA77ACATTTT7GA7GAAGACCA7A7T7ACAG7GAT7G7A7C7G 

-7CAAGC7CAAA77AGGA77AGAG77C7GACAAATACA7A7G7GAGAAG7 

^7GAGG77AAA7AC77GAAA7TTGGAC7777C7AGAAAA7C7GAA7G7G& 

"7GCCA77CACA7ACC7T7C7GGGGATGA7GA77C77G7AC7777A7777 




rGGCTT UAi Uv- 1 - i w ij^_UHVJV_Ak. 1 X X uvjunaww-inviv. . 

GC77GAGC7CAGGAG7TTGAGA7CAGCC7GGGCAACA7GG7GAAA7CCCA 
^C7C^ACCAAAAA7ACAAAAAAAAAAAAAAAAACAACCAAAAAGAA7AAA 
T7AGC7AGG7G7GA7GGTGCGTGC77G7AG77CCAGC7AC77GGGAGGA7 
GAGG7GGAAGAA7TGC7TGAGCCCAGGAGGTGGAGG7TTCAGTGAGC7GG 
GG77GCAACAG7G7AC7CCAGCC7GGGCGA7AGAG7GAGAC7CCG7C7CA 
AAAAAAAAAAAA7CAGATTGCTT7A77GC7GG7T77C7TTC7AAAAC7GA 
GA77GGG7CCCA7CA7CCCCTGGCCCCCA77GG77AA7GGT7CC7CCT77 
G7C7A77GAA7AAAA7ACAGA7G7C7GC7777GGCAACA7GGTrGAA7G7 
AGACAC"GCAGGG7C7TCCTGAC7CAAAA7GAG7AAGGC77AGA7AAAAC 
ACAT ^GAAA7GCA7TrC7GGA7GAACAGCAAGGAAAGGAGA7C7C7TA 
J ^ TC ^^^ CTGTTCCCCTCTCCC T AC CCCC7CCAAG7GGGC77AAG7 
AGGAAG^7GG7GAGCGGCAGG7AAACACACG7CAAAGGCAG7C77CC7C 
7C7GAGGGAAAACAC77G7A7AAGCA77GCAA7CAA7GGGCC7CT77AAT 
7A7G7GCCAG7GGCAAGAGCGGG7GC7GAACCCAGGGGCC7GCC7CAA7C 
CGGGGCC777GAGGCAC^7AAAG7GG7C7CAGG77G77CK^TTrCC77 
GCCC77CCACCCGAAGCAGACACAAA7CC7C7C7GGAGGCAAG77CCCSA 
A77CAGCCAG7ACAAC7CCC^CAGAC7AAGA7C^7CA7G7ACAAGC7wA 
"AGAC AAAGG7CACCAAACACACAGAGCAA7AAACAAATTCA7GAG x GAC 
J i^l^^r^^^^ TAAC CACCAGCTGGGATGCTCTAAC,.
Z . . ^AGv. . * -AGAATTCC7GAATATAGAATAAAAC7GCCACAA7GGCAA 
nCATGCATCTAGTACTTACTGTGTGCTGGGTTCTAAGAATTTTGCACAT" 
3TGCCAGATACCGACTCAGCTTCACACTCACCCTCCTACTGTGCC'*~'~-~ 




: ^S^^I; CTCACAGTGTGGAGT 'TAAAGGCATGGGACTGAGA 
: G ^ G ^^i;r-J AAAGGGACAG TGGCCAGTAGA7GACCAGGGGCTACr 
^i^"^;:i A ^^ CTGTGG ^GCTCAGAGGAGCCTTGGG7CCTGCA 
^ T wAGCTTTCTG TAGTTCCTGATCTCTGGGTCCCACAATCTT 
CCCCG x .TTTGCTCCTCCACTTCTAATTTTGTAACTGACTTCCCTGTGTG 
TACTTCTCTCTCTGATTGAAATAGCCAGACTGGTTTCTG7TTCCTGATAA 
GACATTGTC7GGTACGAACACAGTAAC7CA7T7AA7CCGATATCTCTATG 
AAGGAGGTACAATAATTATTCCTATTTTACAGATGAGGAAACACAGCAGA 

gaaataaagtcaattgtctaaggttgcacatttagtcaagggaagggttg 
atataacatataattatttagaaaacatctaaggaaataaaaggcataat 
ttaaaaataaaactaggcaggtttaaaaaaatgaagtaatctataagtaa 
aaaagtataattgttgaaatacatatcttagtggatgggttaaatagctg 
aagaaatgattaatgaactggaaggtagttctgaggaaatcagaattcag 

CATAGATAGAAAAAATGGGAATTTACAAAAG7ACACAGGAATTATAAAAG 

AGGT7AAATTATAGGGAGGGTAGAATGAGAA7TAACATTGGTCTAACTGG 

AATTT7GGAAGAAGAGAATAGAGAGAA7GAACAAGGCAA7A7T7AAAGAG 

37GGC7GAGAA7777TCAGAACCAACACAAAC7A7GAC77TACCAG7AGA 

GAAAACAA7G7ACAC7GAGGAGGA7AAATAAATA7AC7A7GAACAAAT7G 

rAA7AA7AA7AC7CAACAAAGACAAAGAGAAGA7G7TAAAA7CAGCAAAA 

AAAGAAAG7CAGACTTAGAAAGAAATGACAATGGCAGACTACTCAACAAC 

AACAA7GGAA7CCAAATTCGGTCAAACAGTA77T7C77CATGCTAGCATA 
TAGC 

>Contig40

GGGAG7CCGCTA7GCTCCTAAAGATT7GCACCTCTGATCTGG7T7GTAGT 
7AGTC TC7Tr7A7TGCTTTATCCTAC7CAAC7AATTT7TTTAGTGCCTGT 
TTTTTTTTTTTTTAATGTG7GTTGATGACTACAAT7C7AAACTCATTCTA 
CTGA77CATGGGTGC7TTAAAA7C7GAGCAG7C77TCGCATTTACTGCC7 
GTGATGGCCCA7CCCACCAGCTAAAGTGTG7GGCCAC7GCTTACAGCACC 
A7GTGATAACGAGTAAGGGAGAGATGCCGCCCAGACTC7TCTAGGAGCAG 
CCAGTAGGACC7TCCAGGGGTTGCAAGCAAACCACAGCAATA7G7GGAG7 
GTGGCAGAGGA7GGCCCCAAGAGGATGTGGCAGCGGC7AGTGCAGCTCAG 
-TTAGTC7GAGAGGAAATGC7GGAGAGGAGAGCCCAG7CTGTACAGGCA7 
GACAGCCACAAGGAC7TCAACAGC7AACA7GGC7GAG7GGAC777ATG7G 
C7A7C7CA77ZAGAAAACAGGAGCAA7CAGAAAGGAG7CACC7CC7ATTr 




GTGATC7GTCTCACCTACGTTGTGATTCACATGAACTTACTAATGTGCTA 
TGTGACAACTACQVTCTTAAACACAAAAACCCTCTTTTGATTCTGTGGCT 
CCCTCCAGCTACCCCTGCATTTCTCTGTCCCCCTGCCCCGTCTCTGCACT 
CACTTTTATTTTACAGCAAAACTACTCAAGGGAGTCTCAGTGCTCCTTGG 
CTCCATGTCTCCACCTTTCATTCTCTCCTGAGTTCACTCCTGTCAGGCTT 
CCGTCCTCAAGCTCTTCTTCACTTTTGTTCTAGGGCCGCTGACATCCTCT 
TTCTTGCCAAATTCAGTGGCCAGGTCCTCACTTACTCAACTGCTCAGCAT 
TGTTGGGCCTGGTGGACCACATTCTCCTTCACCCACCT77TGCTGCTCTC 
T CTTC TCTCCAGATGTTTCTCTCTTCTCACTGGCTACTCCTCTTTTGTCT 
CCTTTGTTAGCTCCATTTCTTCCTTCCAACCTCACTGTGCTGGTGTGCCC 
AGTGCTCAGTTTTTAGCTATTCTCTCTrTTCCAGTGGCATTCATTAGATG 
GTATCATGTGACCCATGGCATTATATGCCTTCTACATGACAGTTACTCCT 
GAATATG ^TC TCAGGAAAGATTTGGATTrATTTTTAATTAATTTTTTTA 
AATTTTATTTTAATAAATGAGGTCTCTCTCTGTCATCCAGGCTGGAGTG7 
AGTATTGAG7GATGTGATTATAGCTCACTGCAGCCTTGAACCATGGGCTC 
AAGTGATCCTCCTGCCTCAGCTTCCTGAGTAGCTGGGACTACAGGCATGT 
GCCACCATGCCTGGATGACTTTTTGTGTGTGTGTGTGTGTGTGGAGACAG 
GGTCTTGCTCTATTGCCCAGGCTGATCACAAACTCCTGGCCTCAAGTGAT 

FIG. 3 (39 of 52) 







CC7C7CACC7CAGCC7:. .CAAAG7GC7GGGA77ACAGG7G7GAGACCA. J * 

~7GGGC7AAGA77CAGATTT7G7A77CAA77GAC7G777GACA7C77CAC 

77GGACACC7AAGAGG7A7C7CAAA7A77AA77AAC77GGCCAAAA7ACA 

3AAC7777GACCCC7GCCCCCACAA7AC77GCCCC77CCCCAGAC77C72 

CA777I7G77AAA7A7CCCCAG77AC7CAACCC7CAAACC7A7GAA7GCC 

C777GA777C777CT77CCC7CA7C7CC7ACG77GACGCCA7CAGC7AG7 

T77G77GCC777A7GCCCAGAA7A7AA7CC7CACCACC77C7C7CC7A77 

3CCCGAG7A7AAGATG7CAG7T777CC7GCACAG7CCA77GCCC7GACC7 

CC7GAG7GG777GCT7CCAC7777GACA777G7A77CC7C777CCCCCAG 

GGTCAATTTTTCACAGCAAGAGTGGCA , i l, i''i , rr , i ,, i' , ri'T riT'lT'l'TTTTTTG 

^GACGGAGTCTCGCTCTGTCGCCCAGGCCGGACTGCGGACTGCAGTGGCG 

CAA7C7CGGC7CACTGCAAGC7CCGCC7CCCGGG77CACGCCA77C7CC7 

GCCTCAGCCTCCCGAGTAGCTGGGAATACAGGCGCCCGCCACCGCGCCCG 

GCTAATTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCTTGTTAGCCAG 

GATGGTCTCGATCTCCTGACCTCATGATCCACCCGCCTCGGCCTCCCAAA 

GTGCTGGGAT7ACAGGCGTGAGCCACCGCGCCCGGCCAAGAGTGGCATTT 

T7AAAACCATATATTAGATCATTGCTTTTGTGTTTGGGAACCTCCAAGGG 

CTTTGCATCATATATCAAGTTGACACCTCTCCTACCCAAGCCTGGCTCTT 

TCCTGCTCCTCTGTCCTCTCAGCCCCTCCACCCATTGTTCATGCTGCTTC 

AGCCACACTGGCCTTCTTGCCATGCCACATTTGTGCTAAGCCCACATCCA 

A7C7CGGGGCC777GCAC7CGCA777CC7C7GC77GGCA7GC7G7ACCCC 

AGA7C777CA7GAT7GGCAGC77C7G7ACA77CAGCCACC7GC7CAAGCC 

^CCC CAGAGGGCC77CCC7GGCCACC7CACC7GAAA7AGCACC7CCG 

A77GCACCCA7CCGGTTAT7C7CCA7CC7G77C7C77GC77GG7GA7777 
C CA7 CAC 7GA7GAGGAAA7GAACCA7GGAA7G C7AGGGC7GA7GAC CAGA 
AC77 P " r CCCCACCCCCACA7TA77ACAGAGGAGGAAA7GAGG7CGGAGG7 
AAGA7GGGCCCAGGATT7C7AC7CCCGCC7GGAC7GCAGGCACAGCAC7G 
ACC7CAGC7G7GCTCAC7C7TGGCA77CACCCAACCC77C7A7C7CCAAC 
TGCCCCATTTACCAGAAAGTGAAA7G77C7CAGAGACGG7GAGCCACCTG 
ACTTGGACAGCAGCCCAGGGCCCC7GGCACCCTGCTTTCT7CC7CCCTGC 
CA7CC777CC7C7CCAAGACC7ACC777CCC7G7GA77C77GCCC^£A7G 
CTGCA777CA7GG7TTTATGACCTGATTTC7GAGAGGGA777GAA777TC 




TATTGCTAATTAAGGACCCAGC5ATGTGGGTGAGATGTGC7AAAAGCTGAG 
AGGAGGCTCTGGACTCTGACTATGGGCCCACACCCCTGGGCAGGCATCAC 
AC T AGTCCTTTAGGTCATCCTCAACCCAGCTTCCAGTTGAATCAGATGTT 
-3TGAATAACTCAGCAAGGCTGTATGGGAAATGAAGAATGAGGTGGGGAA 
GAGGCCTGTGCAGAAGACACACTGACTTACCCCTCTACCrCTAACTAGGG 
"GTTGTAGCAGCCACCCACCCACCAAGTCTGTCTTCCAGACCACGTATGC 
T TTCCTCCACCTTTGCATCTTTTATCTTCTGCCAGCCCAGATGCTTGCTG 
ACTC CAGC C CAAGC CTATAGGATAAG CTACAGC CTGTCCCTACAGACT AC 




rGTGGTGTGTTACCTGCTGCAGGTGCAGAGAAGTTGACTTCACAGCCCTT 
CAGAAAGACTGCCTTCTTCCAGTTGTATTTGTGTACTTGCTTGGGTGTGG 
GGAGGATTCTCAGCTTTCTCCACTCAAATTATCAGACCCTTTCCATTTAG 
TGGTAGACCATTTCCCTCGTCCAGGCCAAGGGCACATAGTACAGAGAAAT 
AGGGAGTTG7TACCCAGGGAGAGAACTTGGCTCTAAACCTGTAATAGAAA 
C^TCAGTTCTGGTCTGGAGGGTCAATTTTGATCTTTGGCTCAGATCCAGG 
AATTGGAACCAAGGCTTTTGAACATTTTAATGCAGGGGATTAAAAAAATG 
ATACGAGTCATTCACGAATATArrTGCTTAACATCTAAAGAGATCCCTCA 
AAACACTAGAAAAAATAAGAACAAAAATCTAATAAAACAAAATTTGTTAA 

ACACATTTACCAA ATT ' l ' l 111111 1 GGTAAAAATTCAAATGTCATAAATA 
AAGCTAAAGTTCCTCTTGATGACTCGCTCCTCTGCCCTATTCCACTCCAA 
GTAACCACTATTATCAGTCTTGCCAATACCCTTCCAGACCTCTCTACCTC 
TATATACCATTAGAAGCACATGGTTTTGCATTGAGGATGTGCAGTGTTTT 
G7TTTACG7AAATGTTATCACTCTGTTCTTGTTCCATAAT7TGCCTTTTT 
C^C CAATGATTTGCTTGGCTATCTTTCTATTTCAGTAGCATCTCCTTTC 
"~^~""AAC' r TACCATrGTTTA7TTAACCTTGCCTCTATCAACAGATATGT 
I-A--™^^;^E^ C ^ ACCCAT ACAGAGATGATTrGGAATCT

^;^-;^?;^7 C "GACAGCCATGTCATCCrTTGCACTG^ 

^^I^S^EZ^^^^ATTGAGAGTGAGCTGCCGAAGAC 

^AT--A^^?^ItIS^^^ AKAmTAmCTGAATAA TA 
I A Ii- A ^^^? AAAA iI^CTGTAGGCCAG<^TTC 
-^^^TJ^JEA^^CCGAGGCGGGAGGATCACTTCAGCC 
CAGGAG 4 - -AAGACCAGCCTGGGTAACATAGTGAGACCCTGTATCTACAA 

CTGTGACCCAACTCAAACCTC^TGTTGAATTTTAATCCTCAAT^r^; 

:^r:^I5 A I A SI^^ GTTCT ^^^CCTGGTTATTTGAAAG7 
^rJ^S^" - - -CwCTTCACTCTCTCACTCTCCTGCTCCGCCATAGTAA 

"^^Zir^i:^ A : ,JwTTCC ' rGTA CAGCCTGTAGAACrGTGAATCAGT~AG 
ACCTCT?TTCTTCATAAATTACCC^GTCTCAGG7CATTC~^AIAGCAG" 
jTGAGAGTGGATGAATATAGTGCCATATGTTTG7ATTCCCAGC"Af!P'*ar 




£J GGAAAA ^^^AAACCCAAAC7G7G7AAAA7G7G77CA7AAAAGTGTr 

ACTTTATTCATTTCTTATGTATCTTCCAGAATCAAAAAAAAAAAATCAAA 
TACAAGCACAGTGGAATGTATTGCCCTTCTTCCCCTCCCTTTTnTT 



A 1 ^^auau i I TGACCAACAAGG7C7G7TG77 

iS^il^^™™ 1 1 1 iLi i 1 1 tctgtgaacagactgttaagatccct 
; GG 5ix£ G F" GCTGGATTmG " CTTT,n ™ 

uAGx . rrTTACATGTGAAACAAGTTATCTCTTTATCTGGGGTGTGAGTTA 
CAAC7ACTTTTCC7CTGGCTTGTTTTGCGCTTTGACTTTGCTTCTGGTGA 
TTCCC3CAATTC7GAAAGTGTACTTTTTGCATCATTCATTCTTATACACC 
; -A7v,CTCrrG77CACGC7GG7TCC7CTACCTGAGGGC7TT77C I'l TTC7G 
7G7AGAGA7GGGAG7C - _AC7A7G7TGCCCAGGC7GG7CT7GAAC7CC. J :

GGC^CAAGCGATCCACCCACCTCGGCCACCCAAAGTGCTGGGATTACAGG 

CG7AAGCCACCA7GCCCAGCCCA7G7G7GGAAA7C7TC7G777A7CCC77 

-AGGCTTGATTCTTATGTCGTTCTCCTCCCTCCTTCCTGGATACTCCrCT 

7G77C777A7C7TAC7C7AC77G7CA7G77ACC77G777C7GC77A7AAC 

-AGCTC-CCTCTCCTATCTGAGGAGGGACTTGTGACTGTTCTCATCTCTGT 

ACTCGCAGGTCCTAGTACATAGCGCTTGCTCAACAGATGTTTGGTGCATr 

GATAGATAAATCACTGGTAGCTGTTACTACCAGTCCTGACTCCCTGCAGT 

GCTT CAGCTGAT C CTGTTCCAGATGTGCACTGAATATCCTTCTGTTGAAC 

AACAGAAA7AAAGGGGA7GGG7GAGGAGGA7AG7C77CGG7GGCCAAGGA 

TATTTTTAGGTACTTTGCAGCACTCAGCAATGAGGAGTGGGC7TTAGTCC 

CCCAAGAAC7C7CACAGCCC7GGG7G7C777AC7G77CAG7G7CAAA7CC 

AAGACAAGTCAATGATCAGGAAAGACCATTTTTTTTTGTTCAGTGAAGTT 

TATTTGAGAATCATTGAACAGTATGATATTTGGTAATTTCATAAATATTC 

CCACTTAAAATGATCGGAGCAGATATATTTTCAGTCGTAATTAAAGGACA 

TGATTTAAAGAGAGCACACCAGTCCAAATTGAAATGATTCCA TAGCT ATT 

AAAAAACTAGGGTTITTTACAGACAATGATACTTTTTGCCCCCTTTGAAT 

AGATTAGACCAATGAATAAAACAAACAAACAAATAAATAAATAAATAGGG 

AAGCGGTTGCTCATCAGAATGTGGGAGCGAATGACAGAGGGTTTCTTAGA 

^CCAAATGTGGCCGTGGTTTCTGTCAGGCGTGCTTTAAGTGAGTAGGAGA . 

GGTGAGAGAGGCCTGGC7CAACAAAAGGGCTGGGGATTGTCCCTGAAGAA 

CCAGAGCTGANTTNCATCAGGAGTAACANAGGTAGATAG 

>Contig41

C C GC G77GAGG77 CCACGCAG77CAAA77A7G7 CCAATTAT CAACAX7AA 

~GCACA7777CAA7AGAACC7G77CCGGC7777C77AGGAGGGGGGCGGG 

GAGACG77G77C7C7GGGAA7AAG7G7ACGCAGGAGGC7GAGAAGGC77C 

ATTCGATAGCATTCACTTACCTCCAGC7GTAGAG7GGGC7TATCATCT7T 

CAACACGCAGGACAGG7ACAGA7TT7777CTT7GAGGCCCAAGGCCACAG 

G7A7777G7CA77AC7TTC77C7CC77G7ACAAAGGACA7GGAGAACACC 

ACTGAAGAAAGAAGGGGG7CT7G7GGT7AGGGACACAGCAG7GCAGGG7C 

ACCCCAACCCCTAGGCCCCATGAG7AGGA7ACATG7AA7TTGG7AGCC7C 

7G7GGGAACCCACAG7GAGG7TCCT7GGCC7AAGACACAGGA7AAC7TGA 

CTTC7CACAGACAA7AGCAGGG7CA7777G77GA77TAGGG7TTCCCC7C 

AAAGGCC7GAGGG777C7CAGAGCC7CA7AGCAG7AGGAACGGAGAA7GA 

AAGAGGG7C7ACA7777AAA7GC7GAAGGAAGGAAGGAAGGAAGCCA77G 

7G7CAC7GGC7GGCAA7G7GCCCA7CCACAGGAGCGGAACAAC77GA7CA 

A7G7GGAAGGAAAGGAAAGAGG7GAGGC7G7AC7TC7GCCAGAAA7CAGG 

-ACCAGAAC7G7T7CAGGAACAGAGAG7AGCCCA7GGGAAGAAAC7GGGA 

"AGGAGAGGC7GAGC7GGGAAAG7GGC7CCAAAGAGAGACAC7CA77TTG 

VTC" C7CAG7CACAGCAG7G7CAA77GGAGGCCC7GGGA7CAC7C77A 

"tACC CGA77CCAAAGAAACAGGA7TT7C77GGCC7GGC7GAGAGCAAA7 

AGC~ r ^CCCC7GAG7GAGGC7G7CC77CAAAG7CAGCAGCCTTAG77GCC 

CACAC7CC7G7GCAGAGGCT7TGGC7AC7G7GGCACGA7GCCAGGCAGA7 

CACCACAGC7AA7GATGGGT7CACCGCAC77GAAAC7TTTGCCCG77ACA 

GCGGAGAGA7A7AAGT7CCTGCTGGGCGG7AAAA777CCC7ACAAGGAAC 

CACC7GGCA77GGGTGGGACGGA7G77GGGGCAAGGGGGGAAGAC7GGGG 

AGGGGGA7GGACACA77ATCGCTCCAGCAC7C77G7T7CAGCC7CAACAA 

CAGGAAGAGAGAACCCACAGGCAG77AGGCCA7G7CCA7CAAA7GACCCC 

A7A77G7GGAAGAA7TGACA7TGCAC7A7GCCCAAGAGAC77GGG7GGAC 

A7GG7C C7GGGAG7GC7TGAGCCG7C7AA777CTCAGGG7CACAC7CCTG 

77AACAAA7GCAC7GGCCAG7GCAA7CAAA7G7GCCA777C7AGGACCAA 

AG77TG7A7A77CC77777AA7A77777777CAC77G7G77GA7CAT77G 

CC' T ^AAA77AAC777C7ACT77G777AAAACA7GGAGAA7TAGCAAGC7G 

CCAGGAGGCCAGGCAGGGAAACCAGGA7GT77CCA7TTACC77GTTGC7C 

CA7A7CC7G7CCC7GGAGG7GGAGAGC777CAG77CA7A7GGACCAGACA 

^CACCAAGC777T77GC7G7GAG7CCCGGAGCG7GCAG7TCAG7GA7CG7 

ACAGG7GCA7CG7GCACA7AAGC77CG77A7CCCATG7G7CGAAGAAGA7 

A GG7~~~GAAA7G7GGAGCACA7G77G777AGG7A7AAAA7CAGAAGGGC 

AGGCC I; G7GAGGCGAGG7GGCAAAA777GA777C7TGGAGGACACC7GA 

3CA7A7ACGG7CAAAG7C7GA7GACAACACCAG7AGGGA7GAAGC7GGGA 
^-^^^^TTCAAAACCAGCCTGGCCAACATGGCGAAACCCC

GGT^w^G^A x uuAGTAAGCGAAGATTGTACC&CTGr zrTr r rrw*/^ 



i^^^:-i AC7 ^^^^CAGCAGGTTAGAGGCAGACTCAAGAT 
I^^JHZ?~SZpZ^^^^^^^^^^^^"^^AGAATAAA^C 
« * AC. .TGTTCATCTTTTTTGTACATGCCCCACC"*ACACCATAr 

A EE?);- * * * - wATGCAG ATGACCATAACATTTTCCATTCACCTATGCr , C 

^I^^attcaatttttcta^ 

TAACACTGTCTCATAGGCATTCTGCAAATCCTGTGAGAGTACTTTTTGTG 
CTGAGGTTGCCTTAAGGCATfiaTa iTwrrri » a^^^CC^. IZ2 



^^S^^^^^CTCCTCCCACAGCTGCAGCCAACAAGTTATGCC 
^SJ^'^^^^^^^^A^TATGTTTTTAACAAGATTGAGGACTGGA 
iBRr^~-"-~-^ ^^^^^^^^^^^A^TAAAGCCAGAG 
^^^^^C^C^CCTGCACAACTGACCTAGCTAGGCTGATGGC 
; Zlt trrr* - rtGGAAGGcrA CTGAGCATCATATAAAACAGAAGGGACAGC 
^^:-"-- AA ^ TGGCTCTTTGTAAGG ATGAGTCrGAAAAATGACCATTT 



77 A ^S^^^ G7GA ^ GG ^^ AGGG ^G A ATCCATCTCTTAAAAGGATA^TC 
A ^^^I A ^ G ^ GAAGAG ^ GG ^ G ^ G ^CCTACTATGAAATGGGAAACAT 
^?E^ CTACTCCTCCCCTGTCACG ACCAAGTG7GGCCACCACCACa^CG 
G ^ GAC7G7GG7GA ^ A ^ GA ^ GA ^CAAGTGGCCAGGTaVGCAAGT 



- ww- * w - -nnftUL i w<-<> i wAAACACCTGGCAAGTTTCACAGTGATATGCG 
5^^^ GTC ^ GAAGGCAGATTCT AGGCCTGGCAGGTGGGCACCCTGGG 
; HEH?^ 3GATC7TGAGGCC7AACC *' G ' rAGC ^ c AGCAGAGTCAGCT 
AAAA a. C . -•AGCTCTCCCTCTCCCTCCAAGCCACACTTTGCAAAGGGATTC 
C . . GTATTGTGGGCTTGGAATCrTTTCTCCCCATTTjCCrCTGCAGGAAG 
-CC77GCAACAACACA 7GGA7AGCC7CCAGG7CCCAAGGC7GGAGG

~77G7AATGGGAAAG7AG7C777AAATCAGA777AC77GGCACCC7G777 

3CCAC7GAAAGAGGCAA777AGGGGAAAAA7C7GG7C7CCAAGCACAGA7 

AACACTCTACTCTTGAAAGAGGAGACCTGCTCATGTTACTGGTCTCAGCG 

~CT-CAC7GACC7G7AA7AAGCCATCA777CAC7GGCGAGC7CAGG7AC7 

-CTGCCATGGCTGCTTCAGACACCTGTGTAAAAAGGAGAAAATGAGTGAC 

-7CCCCA7GACGGC7ACG77CA7G7G7GA777C7C7CAGCA7CGAG7GCA 

-GGCAG7CATGCAAAGAAA7GA7C7C7GAG7AAA7GAA7GAATG7G7GAA 

AGAGAAG7CC777GGG7C7AGAGAAAAGCA777GC7AAACCAAACCCCAA 

~TAGCAATG7A77GGC7AGGAGAGC7GGAGCAGAGGC777GACAC7AACC 

TTTAGGGTGTCAGCTGTTAGATAAGCAGTATCCATTCCCAGAATATTTCC 

CGAG7CATAAGCATTATATTACACCTGGCATTTTTGCAAAAAGCTGAGAG 

AGGGAGGCAGAGAGGGAAGGAGAGGGAGAGACAGAGAAAGAAAGAGAGAG 

AGAGAGAGAATATGCATACACACAAAGAGGCAGAGAGACAGAGAGACTCC 

CTTAGCACCTAGTTGTAAGGAAGATTAAAGTCATACTTGAGCAATGAAGA 

TTGGCTGAAGAGAATCCCAGAGCAGCCTGTTGTGCCTTGTGCCTCGAAGA 

GGTTTGGTATCTGCCAGTTTCTCCCTCGCTGTTTTTATAGCTTTCAAAAG 

CAGAAGTAGGAGGCTGAGAAATTTCTCTGTTGAATACCTGATTTCACAAT 

CAAGTTAAAGGAAAGGGGAAAAGAGTATTGGTGGAAGCT7CTTAGGGGAG 

GGGACTAATAAACTGAGATAATTCTCTGGTTCATGGAAGGGCAAGGAGTA 

GCAAACTATGACACATTTTGCAAATGTATCACCATGCAAATATGCATTG7 

■"^TCCTGACAATCGTTGTGCAGTTGATGTCCACATTAAAATACTGGATTT 

~CCCACG77AGAAGAA7G777AAA77TAG7A7A7G7GGGACAAAG7GGAA 

GACACACAGATTrATACATGCACATACTTTTCTTCATTCACTTCTTTGTA 

CTAAGTTrAGGAATCTTCCCACTTACAGATGGATAAATGGGTACAATGA 

AGGGCCAATAGCCCTCCCTGTCTGTATTGAGGGTGTGGGTCTC7ACC7TG 

GGTGCTG7TCTCTGCCTCGGGAGCTCTCTGTCAATTGCAGGAGCCTCTGA 

GGAGAAAATTGACC7TTCTTGGCTGGGGCAGAGAACATACGGTATGCAGG 

GTTCAGGCTCCTGACGGAGTTGGGGCAACCCTGGAGATAAGCTCACACAA 

CCCTGCAAGACCAGGTGCTGTTACCCTAGCCAATCTCATGGATGAACCAG 

ATCAATGCCAGATGAGCTCTGCCTAAAATGATTTTTTGGTGAACTCTGAA 

AAGTGGAATATTGTTTCTGTAAGAATATCCATCTGAGACTCTATCTCTTG 

GTAATACCAAGAGTTATCAGTTTCTCTTTAACCGAGACACCAGCAAAGTG 

CCTGC7CCAGGGTAATGCCCAGGGGAGCCCTCCATTTGTAGAATGAATGA 

GAG7CCAGGT7A7GAACAGTGCC7GGAG7G7AGGAACACCC7CC777GCC 

^C77^GACAGG7C7GCA7CA7AACAC 1"1"1"1 111111 1 1777GAGACAGAG 

^C7CACT*C7G7CGCCCAGGCTGGAG7GCAG7GGCACGA7C7CGGCCCCC7 

3CAAG77CCGCC7CCCGGGT7CACACCA77C7CC7GCC7CAGCC7CCCCA 

"-CAGC7GGGAC7ACAGGCACC7GCCGCCACGCCCGGC7AA777777G7A7 

:1 -~r7AG7AGAGACAGGG777CACCA7GT7AGCCAGGA7GG7C7CGA7C7C 

CTGACC77G7GA7C7GCCCGCCTCGGCCTCCCAAAGTG77GGGA77ACAG 

GCG7GAGCCACCG7G7CCAGCCTGTAACAC77CTTATAGCAC7GAG77GA 

AACC^GC7CCrCCTGG7TCCTCCAGGAAACTGAAATCT7777GAGCCAA 

G7C7AGCACAG7GCCTGGCATGTACATTCAGG7GGTAGAG777GC7GC7T 

GAAXGGG7GAA7GGGAA7TTGACAGCAT7TTTAT7CAAA77AGTATGTGC 

CAGG7A7CG7GC7CGCTCTGCATTATCCAASGGAGTGAGCCTCTGTGCAA 

GTA777GAGACACGAGGGAAATAGG77CTACTGTGGGAAAAAGAGCATTr 

CA7GGACTTGC7CTCCAAGGAGCCTTCTGATTTTTAATTTGGC7CCCAG7 

A7C' T TGA7A7CAGGAGTCAG7CACAAGAAC7CCA7C7TTAG7AAG7TATA 

77TTCCACAGGAAA7C7AAAAGC7GTrCAACA7GTTAGT77CCrGTGAAT 

77GA7AAGCCA7AA7CCAT7CCTAACAC7GAGCCCTCCTGAAAT7TGG7G 

-C7GG7CC7GCAGA7AGCTAAAAGCCC7G7C7GGGTGGCC7AGGGACTCC 

TC7G7777GCC7CCACAGGA7CCAC777GCAAA77AACCAC7GG77CTCC 

CGT7G7AGGAAC7GCCACCT7CC7CAGAGCC7G7C7T7C77CC77CC77C 

CTTCCTTCCT a ' m 1 1 1 1 11 1 1 I CTCTCICTCTTTC mC 1 1 1 1 Ll 1 1 1 
C7TTC77^C777C777CITrCTITCTr7CTT7C777C777C777C777C7 
, cc — c ^ TCTCTT rc7C7C7T7CTC7CT77C7C7TrCTrrC777C7C7 
- c ^--^cCC7C7C7C7C777C777C77777C7*r7C7777C7C7777 
I^^ c i^ c ^^ TcrrrcTC CC7CCC7C7C7C7C777TTCTTTG7C7C7CC 
Z**;^-^^;~^ C7CTrrC 7C777CTC7C7C7C7C7C7CC7AGACAGGA 
i^°S^I^^ TCGAGATATG ^CK3ACCCGAGCATCMC^

;S AG ^"-- AC ^^ CCATOTC ^" G7AG< ^TCCAGGACCTCAGAAT 

CACAGAGAx^AAGACATGAATAGCTATTTGAATGTGAACAGCAGACGAAG 
AAATCAAGGCTAGGAGGRTnna artiv-a n*>r>*m£Zt 7^7^. Zv„__V A _ 




Arrraaa^~rr7Ao^^:^i ^^^^CCCTGATGCTTTCGCTCG 

CAGTTCTGCCATGTCTCATCCTGGCCCTGTAACCTGGACCCAAATCTGCT 

ACCATCCCATCCATCTCAGGAAGTGAAACCTCTTATGTCAAATAGGTTGT 
GCAACGTATGTATCAGATrrTrtTrt-rr'^ii » r-^n ^« . 




; * -wviwviuwwnurtiouyuhbL 1 vjOHAAI uLL-AG 

GATGCTCCAGCTTTTGGGGAATTATTCAGCTCTTGAGTCACTAAAGCCTT 
TCTCXGCTGCAAGTTCCTCTTTACCCTGTG^GGTCATTCTTCCAAGACAG 




-AATv_aTT . -GAACAAGAAGACAAGCAAAATAATCATGGGTTAGT-'' 
rTATATTATTGTGTGTACATGCAGTGATGTCTGTTCTTTGTAGTGAGC'G 

TTCCTTCCrr?GTTCACCCTCTTGCTTAGAACAGAACTAAGCAATCTGCCC 
& pi ttt""^^^^* * — — — _ _ 




1 1 ii_ i iUiAXCATTT CCATGAGTCCCTCTGG 
GATCTTAAAGTATGAAAAATGTTGTGTGTACCCACACCTGTCTTTGTGGA 
TATTTCTCTCCTTTCCCTrCTGCTTCTGGGATTATTTGGGAATGGGCACT 
ATGATTTTTATCATATCGCTTCCACTTCCTTTATGGCATCATCTCCAATG 
GGC I i u i i CTCCCTCTTCX5ATCCAGGTTCTCAGATTGGGGACATGCAGAG 
TCCAAGGAACATTCCATTCTCCTCCCTGGTCTAGAACAAGGAGGGCTTAG 
ATATATGAGCAGGTGGCTCGGGCTGGCGAGCTATGTAGTCTCCAATGGCT 
TTrCCCTGATGTCGGAGTTGTTATGTCAGTTCTGGGAGACCAATAAGACC 
TTGTCCTTCCTTTGGATCCATCAGAAAAAGCCCCTGGGTGGGTAAGATGG 
ATGGCAGGGCTCTCCTACrCTATGTCTTTTCTCACACCTAGTGGGTATAA 
3AGAGGGGACCACAAACAGAGGGGGCTCTGGTACCACTTATCCAGGGTCT 
3GAAACATTrrCTGTAAAGGGCCAGATAATAAATGTTTCAGGTACAAC"A 
ZTCAACCTTGCATCA7TTCAGAAAAGCAGTCAGATAATACATAAATGAAT 
3GGTGTGGCTGGACTTG7CCTGCGGTCCCCTGTCTTATATCATTGTATTA 
TATCATTTTTTCTTACATACAAATTTAGAAGCAATACTTAAAAAAAAAAA 
GCCGTCCTTTATTGAGCACCTACTAAGTGCCAGGTACCTTTTTTTCCCTC 
ATTATCTTATTAACTCrrCATAATAACCTTTAAAGTAGATAATATTGAAC 
CATTTGACCTATGCAGAAACTGAGGTTGAGACAATAAATTATTTAAGACC 
GCACAAACAGTAAWGCTGGAACTACGACTaUU^TATGGGTTAACTGAAC 
CAAAA.CCAGATCTTTATTTCTCACTTTTAATTGTTACATATGTTTATTGC 
CTCATCTCCTGTCCACATGGTGCCCATCGGCAGACTCCTTTCTCATTCTC 
AGTGATTGAGTGACATTCTAAACTACATTGGCCTGGCAGATTCACCTCTG 
TCCCCTAAATGTTTCCACATTGTCCTTTTAGGATTGAGATCCTCTCTGTT 
CCCTTGTCTTCCCTCCTrTCTTCTTCTGGCGGTGACGTGCTGTGTGAATT 
TGTTTCTTTCrCCTCTCAGGGTAGTACTGGGACTTTCCAAATCAGGGTTT 
TTAGTGATCTCTCTTCCCTTTTCTGAGTTTCrTCCTTATTCCCATTCACT 
TTCTCATCTATAAGTG GCAG CTTTGTTGCTGGAGGATTTCCTTTGTCCTT 
TTATTCTTCTTTAAGACTTTGTCATAACTGTCAAAAGCAATCCCTTGAAG 
GTATCTGTCCTTGGAATTGTGTGCTTATGATGCTGAAAAATACTCTCTTC 
CTAAAGCTATTATAAATGCT 
>Contig42 

GGCTAGCTGCAACTCTTGAATACAAACACATTCAGACATGCACACACTTT 
CTGGCTCCCAAAAAGAAAAAAAAAAATCAATTTATAATAATTCTGATC'"^ 
rTGCTTATTTCCACAAACTCCATGAAAATTGTACATTGTCOVAGCAACAT 
TCTTAATATTCTCTTi. . 7C7C7CA7A7CCA7777CC77AC7GC7G7C - J

ZACCTATCTCTTCCAAACTCCCTGTTAAAATCCCTGCCCCAGCGAACTTT 

-A777AA7777G7GGAA7GGAGGC7GCAC7GA777AAA7TAAA AAAAA AA 

AAAAAATCCCTACTCCATGTCCCAGATCCCTAGTTGTTTTTTGTTTTTTG 

— TCC7GAGACAGGGTC77G7G7C77CCA7GC7GGAG7GCAG7GGCA7G 

A7CA7GGC7CAC7GCAGCC7CAACC7CCTGGGC7CAAG7AA77C7CTTGC 

C7CAGCC7 ZZ CCAGTAGCTGGGAGTTCAGGTATG7GC7AC CATGC CTAGC 

""AA7777777C777TAT7T7G7AGAGACACGG7C77GCCAGG77GCCCAG 

GC7GG7CTAGAACCCC7GGGCGGACG7GA7CCGCC7GCC7CGGCC7CCCA 

AAG7GC7GGGA77ACAGGCGTGAGCCAC7GCTCCCGGCCT7GGGTGCAAA 

""TTGAGCT77C7CAC7TATTAG7G7AAGACATACAGC7AAT7TC7AAATC 

; TCCAAACC7CAGATrTTTCATCCATGAAGTGAGGA7TA7TA7AGAGCTC 

AC7AA7AACA7GGCTTCAAAAA7A7A7AATGCCAAAAT7GAGA7CAAAA7 

AA7AAA7C7A7A77ACA7GGGAGA7C77AA7G7ACC7C77A7A77A77GA 

-AGAC7AAGA7GATCAAAAAAA7AGAAAGAGAGCAG7AAGGAGAGCAAGC 

A7T7AA7CAATAGGACCAATACATTTTAATCAATAGGA7CCXCAGGAATA 

" r A7ACAGAA7ACCAAACC7AACAAC7GCAGAAAACATGCCAAACATTTAG 

GTACAGACATTGTTGGAAAA7GCAATCTTGAAACGAGTGGACTGACATTC 

AGAAGA7A77AATAAGAGCACTAATGATGGGGA7TGCAACCA7GTC7T7A 

C7GAC77CCAGAAGCrTCTTACAG7AAACA7GAAATCACA7AA7TrCTrC 

CAC^r7CC7AC7GTTTCT7GTTCTGGGCTC7GTCC7GC77AC7GTC7AAT 

ATC~7GGCCCCTTAAAAGTTGCTAA7CTTCCAAACCTCA77CC7G7GAC7 

GGGCCGC7GG7CCrTGTTCA7GGGCC7TGAAAA7AC7GAC7G7ACAC77A 

-C7GGAGCA7CCAGTGCCTACCACC7GACCCAGAT7CC7CA77GCGC7CC 

rcC""CC7CCACCTATTGGAATT7GCTCA7ACCCG7G7GAGACCCC7CCC 

77TCCCCCCA7C7GAATTT7TATCAAGACAACGCACTGCCATAC7CCC7C 

G7AC C77GC7CTGGGCATCAGAC7GAA7G7T7G777CCATTGAGGA7CTG 

CAGC7GCA7CAG7TTCCCCAGCACCGTCCAACCCCTTGAGCA7GGC7AG7 

CCTAAAGCAGAGAATTAGCCT7TCTATCCCTGC7GCTATACATGCTGGGA 

r&AATAATAAGAAATGACAGCATT77ATGATAATGCAGGCTGCAG GAGGC 

AGGAGGCAGGAATCAAATTCGTGCrTATCAAATAGTGCTCCAATTCTTTG 

AATA77GGAC7A7AGAA7ATG7CATGGATCTATGCTCAGG7GGGTTCCC7 

A77AC7CAC7CCAC7GAGGCCAGG77G7GGGA77AGC7G7CCAAGAGGGA 




jTwTuwVjHCU 1 ^Ui*a*\V3«/*wvarw* ww * * w>* * * - — 

GGAGGAGAGAAG7GGCAGGATGCCCAGCCCCACAATCAGAGGGGAAGGGG 
CAGAGCCACA7GTATGAAGATCC7C7CCCCAG7ACG7GCCAA7CACAGGG 
-T^-^~AGC"~^7GGGCCAAGGAAACAA7G7GGGAAGCAAAAAAGGACAA 
-^;Z^-^-- :rrTrGCATGAAGACT G A GCAG777TACCAGA77CCCAGG 
GAAACACCC77CCACTCTGGGTTGAA7G7GAG7GAGAGACAT7CAGC7GG 
AACAC7AGAAAAACrATTTCCTGAGCGACTCACC7TTAGCCC7AGAAAG7 
G77GGA77TGTCCTTCATCTTTGCCACAGTAGAGACTGCTGATAGCATCA 
GAAC77GGGCTCTGGAATTAGACAGATATGGGTACAAATCTGAGCTCTC7 
CAC77A7TAGTGTGGGATCTAGAGCAACTTTTAAAATCCTTCCAAACCTC 
AGAC77C7CATGCATGATGTGAGGATTGTAATAGGGCCCACCTAATAGGG 
GTTTT7GAGAATTAAAAAAGTTATTG^TGRACAGCATTTAGCAAGATGC 
C7GACCA77GAGAAAA7AACAAA77G777A77A77A77G77A77A77AAA 
CATCT7TCCTGCACCTTCTGACTGGGGGCATCG7A7CATCAGAAATACTT 
AGGA7GGGA7GGATTCCTGCATGGGC7GAGTCAAGGGTGCAATAATGGAG 
GAGTGAAGAAGGAAGAAATGGAGGCAGAAATCCCCAGGAGCCCAGCATGG 
TACAAGGC7GAGCTAGTGC7GCAGAGCCTCCTTGGAACAGCCACAGAGCT 
7GCA7C^GGCCCTGGGAGGAACCTCTTCTAGCTGGCAGGACCAGCCACAA 
CAG7GGCCAGGGGATTTCCaVGGGC37GGGCTCC7AGGAG7TCATTTGGA 
CCAAGCCTGCCTGGAGAGGGGTTATAACAGGGATCCTTCCCTACTGGCAG 
GTGA777ACCCCTCGGTGAGAAGC7CAGGCATT7GTTTGATGGAAGGTGG 
AAGGCCC7G7GCTGGGCCAGTGAC7ATCAGGGA7GGGCGGGTGGCTGGAA 
AA7AGCAAA7AAGACAATATGA7AACACAG77AACCACCACAC7A7GTGA 




:C7CAC7G7GGAATGTCC7CC7GG7C7CC7CA7GCCCAGAGAGTGG 
^TACTCC7AC££.. .CACCGGCTTTCCTGTCATCTCCCTGCAGCC„>

" t~ ' - G p* GTGGT ACC7GCACACAGG7A7TGG7G7CC77G7C7CACC 
ACCC7ACA7CAC7G7AAGC7CCCCAGGAGCAGGC7" r CC7G7TTGACTCAC 




TTCTCAATTTTTTATATTrTTTTAATTTTTTAAATTTTTTAfTTT 
TATTTTTATTTATTTATTTATTTTTAA I T T T 1 ' I I' TTA ATTTTTTAAATTA 
TGCTTTAAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACATAC 

gcatacatgcgccatgctggtgcgctgcacccactaactcgtcatctagc 
attaggtatatctcccagtgctatccctcccccctccccccaccccacaa 
cagtccccagaatgtgatgttccccttcctgtgtccatgtgatctcattg 

AATTTCTTTAAAGGTGGAATCTCTCAGTGGGGTCTAATCTGTTCAGAAAT 
ATCAAAAGAGTATCCTTGGGAATGACTGGAATTCCAGAGTCATCTGGTAA 
TCCTCATAAAACAACTCCTGGATGTCTCTCAGCACATCTCCCACC'~ T 'GAA 




- * * * ww^w An nnu j. ^v-aaaaooooa i At UTTTCATGTAAATAAATCA 

ACTGCAAATCGCTAGTTATGCTGAGCCCTGTCCCGTGCTGTGGACACAAA 

GGAACCAAAGGCTTTTCTCCCCGCCCAACACACACATAACACACACACAA 

AATCATAAAAACATACATACCCCCAACACATAACAACACACAACACACAC 

ACAAAATATATACACACAACACACACCAAACATGCCCACAAACCTGTGTC 
CAGA.GAT AGATCTTt CTrVTrrv?m~rTT±'r>r*r*T'r'Ks«r- » ,i m . .i >,. . . 




CAGC7TTTACTTCC7TTTGGCCCCTCCCATGTTCTGTTTCCATCCTATCA 
GAGTGCCCTTTTTTCAATCCTCCCTGTGATTGGCTACTTTTAGAATCCTG 
CTGATTGGTGCATTTTACAGAGTGCTGATTGGTGCGTTTTACAATCCCCT 
-G7AAGACAGAAAAGTTCC7GATTGGTG7G7T77ACAA7CC7CTTGTAAG 
ACAGAAAAGTTCCCCAAGTCCCCAC7GGACCCAGGAAGTCCACGTGGCCT 
CACCT7TCAACTCCATAATGGCATGAAAATACATATGTTGTACAAAACAT 
ACATACACAAAGTATACATGCATCTCCCCAAATATACACATACCACAGAA 
ACATACACACAGGAACTCAGCTACCTGTCAAAAGTCTGCATGGTGATTGC 
CTCTGCAGTGAGTaGTTAGAAAAGTGAATTTGTTTTTCAATAAATTGGAG 
TCC7TAAAAATC GTTGT AAGATAGAAAATTTTTAAAAGTATATAAAATAA 
AA7ATGTA7G7CC777GGTC7AGCA777ACACA7G7AGGAA777A7CC7A 




gtttatgaatactccatactacactaggtagcaccccctattaaagacaa 

actcttctctctcatttcccttcctttccggaaccacttggttgaatctc 

tacaagtctctattgcaactgcctcaacatggcaccctccctgcatctcc 

atcttccctg7cctgagagcaatggcctgctgcccccacactcacatcct 

cattcattccagaagtgagcaccacagaagtgcctacagttaccccaacc 

accttcttagaagataagttagtgtttgttttgactttttaaaatt:^ 

cttcctcrrttccttcacaatctcatcccatcccaagaggtttatcaaga 

agttctctaaagatatgtg7ctccttatggaat7taacagaaatcaggga 
. . . o i. a . - - . AUCLA. v iGGG AAT AACATTTTT C CAGGT CTTZAGAC
A7AA7GGAA7ACC77GCAG7AA7TAGA7ACACTA77G7AGAAAAG7ATTG 
A7GAAA73GAACGA7G777GAGA7A7CA7A77GAG7AGAAAAGGCAAGA7 
ACATTAAGTAGGAAATGTATCTTACAAAATAATTTGTCAGACACACTCCT 
ATA77737A7G77A7A7AAA7GCG7A7G7GAAGAAAGGC7AGAGGA7GAG 
ACCACAG7C77CGG7GAAG777AAGAGA7GA7GC7GCAGCA7GC7CAGAA 
AGGCTTGGTATAGTTTTTTCCAGTAATTAAGGACTGATCTTAGGTAAATT 
GTCCATCCTCTCTAAACTGCACCACCTTTTG7CTGTAAAACAGGAAGGAT 
SG7A777ACCCCCAGGG7CA7CAAAGGA777GGT7GGAGAAAAA7AAATA 
AATGGGCTGAGCCCAGACCTGGCACAGTGAGAGCACAGTGGTTGACTATT 
GTGCTGGCCTGTTGTTCCTGTGTTATTGACATGCTGCTGGTGGTGGTCCA
GAAGCTATTACCTTAATTGGTTATGTGGATTTCCCCTCATACTGAGCAGC
TGTGTGTGGTGTTGTAAAACATAGCCATACACAGTAACTGACAAGGGCAA
ATGTGATGGAAAAATGCAAGGAAGTGCAGATAAATAGCTAATGGGCTGTA
GAAGGAAGCTAGTCCTTGGAGGGCTTGATCAAGGAAGGTCCTTTTGCATG
TCACCTTTGAAGAAGAGGGGACATAGAAGAGGTATAGTGCATCCCGGAGT
G7ACC7GGAAGGGAACA7GAAAAGAGGACATTTTTC7CTGGt5ACA7GGGG 
ACTCCACTTGCATGAACTCTGGAATTGGGGCAAAGAACCATCATGAGAAC
AAGGGCTTCCTTGAACCTCCCAGGCTCATTGGCTGATCTAAACCCTGTGT
CCCC7C777CC77CAC7CTCC7CTG7TTrCTA7ACC7G7A77A77GGAC7 
GGACTGGAAGCCACCTGATCTATCACAAGTACCTTGAAATGTGTTGAATA
GGTGTGGCACAGTCCTTAGCAGAGTGGCACTACCCCCACAGGAATTTGTT
""ATACC777GGCA7GGAAAA7AGCAGGAAA7GAG7GA7CAC7GA7AACTG 
AGGA7GC7A777ATrA77GGCCAAAGGAA7AC77G7G77G7A777GCA7A 
ACCACTCACAAACTGTTGATTACAAATGAGTACCAGACCTAGCTCCTTCA
AGTAAAGGATCTTGAGAACTGAAGGCAAACAGAGCTCCAGGAGTCCAAGA
CAGAGCCACAGACCACGAGGATCCCTGGCCCAGGTAGGTGGTCCTCCTGC
ACTGGC777CAAGGCCAA(»GGA7GGA7GGXjGAAG7AGAG7AGCA7CTGG 
CCATC7AGACCCTrGC77T77ATCCCCACrGGAAGCACATCTGAA7TTCT 
AAATATGATCTCTGAGACCTGCCCAGAACACCTTGCTCTCAGCCCCAGTA
GCAGCCTGCTCTCTCCCAGGAGGGCTTCCACTAACAAGTAGGGCATTGCT
GGAGGGCCAGGCAGACACTAGCTTAGGAAATCCACCAACCCTGGAAATGC
7AGTCCC77C7CTGAAGGCTCAGAAGACTGACTr7AGAGTC7AGAAAATA 
TTGGTCCTTGGGAACAGATTTTGAGTGCAAAGAGATGGACTTCAGATGGC




CCTCCAGCCTGTGAGCATCCTTCTGTCCTTCAGCAGCACCACAGTATCTT
-A7A7G7 — T7GGATACC7ACG777C7GCCAGACATC7CTrGC7C7GA7G 
rrC73GC7GCCAAATrC7C7GTCAAGCGCC7CCAA7Tr777G7G7CCT:7 
GATTTACCCCAACATGACAAAGGCAGTTGTGCTTCATGTATTCAGGGATA
CTGCCAAACCACAAACAGGTTAAAATCAAATAGCAGATATCCCTGTTCCT
AAAGACCCATCAGCTCTACCCACCTGCTCCTGCTCACCGTCCTTATTGTT
GAG7CC7GAAGCCC7TCCTTGTCATTmATTTTTrGCA7GAACAATTTA 
GTTCCCTTTGTCTCACTCCTAAACCTTTCTCAAAGGATTGGATTTGTACA
CAAACTGCCTATCTCTGCAATCTTAGAAGTGATATGATTCTGAACAAATC
ACTrAACTrTTGATTTTTTATTGXSTAAGATSGGAATACCAATTTTTGCTC 




GCAGAGCAACTGCTTTTTGTTAGGCAAAGATTAGGCTACTGCAGAGACTC
AGCAAACTTCTATAGAAGGTGTCAGATGGTAAGTATTTTAGGCTTTGCTT
GCCAGATGATCTCTCAACTAGTTAACCATGCTATTGTAGCCTCGAAGCAG
CCAGAGACGATCTGTAAACAAGAGCATGTAGTGTTGGCATAAATATAGTA 

CCGCG 

3CAA7AAG7C7ATlTAC7G7AAAGT7AATCAAAT77ACATTrCAGAACAC 
TTAATCTGCAAGAGTCCTTTCCAAGACCCTATACCTAATTTTGTGTTTAC
AA7T7~A7A777GTr77CTTAAAGAAGACCACCAA7A7AAAC7A7ATCCA 
3CC'"~CA7GA7AAG7ACA7AAGAAAC7A7GCAAA7AAGGGGGAAAAAAAA 
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-?^"^n^^r?^^ iAAAT CwTCATTTrCTAAAGATAATT3TTAA 
ACAAGCACTAAACACCAGATJVftftaanrrs&rsarta » ^, ^^r?C 




i TTTTAAC x GGATTTTTGAAAAAGAAGATAAAAGTCTCAnTTAGTAATT 
^TTTGvjvjAGvjC - GAGGCAGGCAGATCACCTGAGGTTGGGAGTTCGAGACC 

agcctgaccaacacggac^ccccgtctctactaaaaatacaaaaSag 



cccagctgaggcctcgtgcccatgctaggatagactcgtccagacatgtc 
aggtggtctgacagggcaagcagcaggaagtcatgtatgagtatgaactg 
atctgtatgcaagggcggggagaacacgcggaggaatggggcgtgagaaa 
acagcacagtacgtttctttagcagctgtctctgctcagccatgggagtc 
accagagaaagaggcttggaggcgttattttcactgtgagatgtgagtg? 
aaaaaagtgcccaagacacagtgagtaccagggagatgccctctttccct 
acccgaatgcagaatggccacaggccttaaaacacacacatggttcctca 




wwww«uv.uiiv>iuv. a ««v»v.UATCATaAAGCTTCACAGGCAATGAGCTCT 

cagcaataacaggaacagtgcctgggggactgtagctgcaagaccgattt 
tcatgtaagatggcctctgaggactccgagatacaccaggctgagactag 
ctggcagctccaagttcttggtcagaagagaacaggaactagggaaattg 
gaattactgttactacaattcctttacatccgcacaaccatgaggtccag
agagtctctcttattttttttttaaagacagggtctcactctgtcgccca 

nrrTir.ar.T^irTr^Tffi'pni'r* ~. 



«««« i^Li^iiAiiiiii l i I rAAAGACAGGGTCTCACTCTGTCGCCCA 
GCCTAGAGTGCACTGGTGTGATCATGGTTCAGTACAGTCTTCACCTCCCA
GGCTCAAGTGACCCTCCTGCCTCAGCCTCTCAAGTGGCTGGGACAGCAGT
TGCA7GC7ACCAGGCC7GGC77777777T7TI I ■' 1 ■ 1 m 1 ■ ■ n ; :■■■■!■■■ 

TCGGTAGAGACTGGGTCTCTCTGTATTGCCCAGGCTAGTCTCGAAC^C^T 

GGGCTCAAGTGATCCTCTGGCCTCAGCCTCCCAAAGTGTTGGAAXTACAG 



— - ~*" « ww^w i \.\»v™ftAi\vj Hj 1 A wAAiTAUiG 

GCATGAGACACTGCACCCAGCCAGTATAGTCTTTTAACAGCTTTATTGAG 
G7J£GGC7AACA77GAAAAAACTAC^^ 

AATTTTGACA^TGTACACACQVGTGAAACTATCACTACAGTCAAAATAA 

TGAACATATCCATCACTCCCAATTTCCTCACGCCCCTTGGTAACCCCTCT 

CTCCCAACTCCCTGCCCCCTAACATCAGACftACTACTGATGCATTCTGTC 

TCCATAGGCTCATTTACATTTTCTAGAATTTTACATAAATAAAATGAa^G 

AGTATATACTCCTTCATGTATGGCTTCTTTCAGCCCAATTATGTCAAGAT

TCATGCTTATGGCTGTGCGTATCCTTAGCCCATCTCTTTGTCTTGCTGAG

TAGGATACCATTGCATAGACAGACCACAGCTTGCTCATCCATTCACTCTT 



GACAACGTTGAATTGTCTCTGTTTTTTGCAATGACAAATAAGGTTGCTAT

GTACATTCCTGTATAGACATTTGTAAAAGCACAGCATTTCATTTCTCTTG 

GGTAAAGACCTAAAAGTGGAAAGGCTGAGTCATATGGTAAATATATATGT 

CTAACTTTTTAAGAAACTGTCAAACTGTTACCCAAAGGGATTGTACAATT

TTACATCCCCACCAGCAGTGTATGAAAATTCCCGTACTTCCACATCCTCA

CCAATATATGGTGTGGTCAATCTTTTTAATTTTGGACATGNTAATGAGTG

CAAAATGAGGCCCAGAGTGTCTGAAGTTACATTTGTATCCTTTTTGGCAT
ecu aa Ar'afwrmvu. afiriTi^ii ~~*~*r*T™T**+ « * — 




TATGCCTTACTGTCAACAGGCACATACACATACAGACAGACAGGAAGGCA
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3AGAGACAAGGCAGAGC . .TGATAAGAAGGTGACCTGGGCTCTAGCrCl _ 
GCCTATCACCTAGTAAAATATTAGTTAAGTAGCCATGAGTAACTCACTTA 




•"•ctatgagagaaacatattrccaagtatt tgatggagtaca tcagacac 
aaaggaaaggaaactgaatatttttgaggttttrrttttttaccaagaaa 
^cacattttgttaaattttcagaactacctcctgaggaaagtgtagctg 
zaCccatttagaatgatagaaaacatcaatctgtctgattccaaagccaa 
gttcrtgctacaacgagaaatgaaacaactggatccctacagatgcagag 
acctgggccccacaaatgtgaattctgttcccctaccgaatagagttaca 
gttccataatacagtactccctcacttttccacagtctcacattccacag 
-ttcagttacccacagtcaactgcaatccaaaaatattaatgaaaaattc 
caaamtaaacaattcagaagttttaaattgtgctccattctgagtagcg 
""gataaaatcttgtgccaccatcccacctgtccagcttatcgttagtcat 
-gacatcgtctgctcctgacatccaaccattgacatcatcatgactctat 

GATCCAGGATCACCGAAGCAGATGACCCTCCTTCTGACATATCATCAGGC 
CAATATCAGCCTAAACACTGCATCACTATGCCCACATCAGTCACCTCACT 
TCATCTCATCAAGGAGGCAATGGATCACCTCACATCATCACAAGAAGAAG 
AGTGGGTATAGAACAATAAGATAATTTTGGGGCAGGCATGGTGGCTCACG 




CAGGCAT~CAAAAC»IK:CTGGGAAA<-A 1 AV» l vjavjav-u i <-<- i. <- * w * 
AAAAAAAAATAAACAAAATTATCCAGATACAGTGGTGCATGCCTGTGGTC 
CCAGCTACTCAGGAGGCTAAAGTGGGAGGATCACTTGGTCCCAGGAGGTC



3AGGCAGCAGTAAGCTGTGATCGTGCCACTGCACTCCAGCCTGGGCAATA 
AAGTGAGACCCTGTCTCAAAAAAAAAAGGTAATTTTGAGAAAGAGACCAC 
ATTCATACAACTTTTATTATAGTATATTGTTAGAATTGTTCTATTTCATT

. , »„ .., mm mf~r~o** p t tt ' I 'I'I I T TTG 



ACTTATTGTTGTTAATTTCTTTCTTTGCCTA AT 111111111 X TTTTTTG 
AGTCGGAGTTTCACTCTTGTTGCCCAGGCTGTAGTGCAATGAGACGATCT 
CAGCTCACCGCAAATCCCGCCTCCCGGGTTCAAGTGATTCTCCTGCCTCA 
GCCTCCCGAGTAGCTGGGATTACAGGCGCCTGCCACQVTGCCCAGCTAAT 
TTTGTATTTTTAGTAGAGGCGGGGTTTCTCCATGTTGGTCAGGCTGGTCT 
CGAACTCCTGACCTCAGGTGAGGCCTCAGCCTCCTAAAGTGCTGGGATTA 
CAGGCTTGAGCCACTGCGCCTGGCCTCTTTGCCTAATTTATAAATTAAAC 




"*GGGTACTATCCACAGTTTCAGGCATTCAU-Hjaw**j\. i. *vj«~»w»wwww 
CTCCTCAGATGAGGGGGGACTACTGTCATCTCCTCAATCATTCTTGATTC 
AATCCTCAACACAAATGGTTTGGCCAGGTCTTGCCTCTGGAGACAAAATT 
-=CTAAGGATTTAGAGGGGAAAAAATG7AGTTCACTGGGAAAGTCACCTCT 
3C-~CAC7GGACAGCAACTTAAAACCCAGGCCATGACAAG7AGAAAGGCC 
\CC CACTCTCCTTCACACCTGGAGTATTCAGGAGTCAATCATATTTCA 




TCAGCTCCACAAGGGGCTTAAC5AAAU^eCTt-ii 3 vjva«j^sjxw* 4V ~"« 
AAGAGTTGGGGACACATCAGAAATGCCATCAAATTTCTAAGGGCTACCTC 
GTGGTGTCAGACCTGTGCATCTTCAAGGACMAAACAGATGGGATAAGCA 
GATGAGATTCACAGAGGACATCAAAATATTGGCrCCCCAG^GG^GAAC 
^TTCTAGTAACAGAGCTGCCCAGCTGCAGAGTGGACTGTTTCACAAAGCA 
ACAGGTGCCCTCCCTCTTGAATCACCATCTTCACAGGAATGCAGTAGAAG 
. — */^<»<^r?i^r»TV'a&rt&aKarMTTifV»^A.GGGAAACAGCTCCA 



CCTTTGTTTAATGAAACTAAAGAGCTGGGACAGGAAATGCCAAATTAAAT
TAATAGAGCCTTGCTTTAAGACAATGCAAGTGGATGGTAATGAAGGAATG
AGTCTTAGGCCTTGGATCAACCGTATTAAGCAATGCTGAGCATGGAGCCA 




|^GGCACCTACn»CaUUUUlGCTGUJ^W*v.UAV.v^ 
GAGAATATGCTAAQUVTAAAAAGTTGAACACCCTG^ATAAAAAAGGGTAA 
AAGTAATTAATAGAAAATTACTGAAAGCTTTTTTGAAACCAAAMTAGTC 
AGCATTGGTAAAAGTCTACAAAAGTGGACACTTTCATATAATGTTGGCAG 
GAGGGTAAAAAGACATAACCTTTTTGGAGGACAATTTGGCAA^GAGTAC 
"AAAAACCTTACAATTGAAGAGAACTTTGGCCTGAGTGCAGTGGCTCACA 
CCTGTAATGCCAACACTTTGGAAGGCCAAGGTGGGAGGATTGCTTGAGCC 
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S^tSII;^™ ^^'GGGGTAAttCAGTAAGACCTCGTCTCTATu ' 

AAAAATAAGAAAAGTTAGCTGGGCATGGTGGCATGTGCCTGTGGTCCCAA

CTACTTGAGAGACTGAGGCAGGAGGATCGCTTGAGCCTCGGAGGTCAAGG

GAGAC^^TCTC^S^a^^^^"^^^^^^^^^^^^^^^ 
S^ar a r a 1 s J, r^S^^^^^^^^^^^^GAGAGAGAGAGAGAGAA 
G^GAGAAAGAAAAGAAAGAAAGAAAGAAAGAAAGATGGAAGGAAGGAAA 
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AGAAAAGGACAAAGAAAAGACCTTTGAACCCTGAATTTCACTTTTAGAGA 

AAATATTTGCTTTTATTTTCTTCTAGTAATTTTATGGTTTAACTTTCTCA
:5IF^ GCCm ^ mAmG ^TTTATTTTGGTATGAGAAAGTGTG 

AC 5;7;j^ GTmAc ^AAAA^ 




----- * w * x i i_u 1 1**. ut«\agu^TTCTCCCTC7TCAA 

CTTAGGAGTAGCTGGGACCACAGGCATGTACCACCATGCCCAACTAATTT 

TTTTTATTTTTTGTAGAGACAGAGTCTTGCTTGTTGCCCAGTCTTGCAAT 

GTTGTCTCAAACTCCTGGGCTCAAGTGATCCTGTCGCCCCAGCCXCCCAA 

AGCACTGGGATTACACGTGTGAGCCACTGCGCCCAGCTGCCTTTTTATTT 

•II A itIir rCAGATGC ^ G ^ aGCTCCA ^ T AGCACTrATTAA^ 

S^ G I * * WW -C^CTGGTTTTAAATACTGCAAGTTTGGCTTTGAAATACAA 

CCA., x GCCTTATTCAGGCTACATTCAAGGAAATCTGAGACCAAGAGTC" 

GAAGGCCCAGTTTCCTTCCTCAAACCCAGGAGGTGGTAAATGTGTCACTT 




ri^ir iir. * , w * w i <-i-<-avjw,xccacagacaaagcaga 

actcacttatggggaaatctgggaaatacttatctgttaaacctgcccca 
tatggtgactcagattgtctaaagcccaaagcatcattttccaccccaaa 
ccatttcctcctccagacttctctatttctgtggtccagagtcaagatct 

TGATATTACCCTAGAGTCCCCCTTCTGCTCTCCTGCATACCCAGATGCCC 

CTCCCTCCCCAGATCCATTCTCCCACCCTCCCTCCCATCAGTTTGGTGGG 

CCCATCACCGCTTCCCCTCGCCCAGGCTCTCCTTTTGTGCGCTTGGAGCA 

GCAGACTGATCTCCCAGCCTTCACTCACTTCATGTGGTAATCTGTTGTGT 
TCA7CACTGTCAGA&.T P TI '( " i wriTr^wrn w» . . . 




- .C. - w -AC.TTTAGTATGAACTGGA1T7ATGGATTTTTTTAACATTGCT 
TTCAAGTATGGAATAAAGAATTTTATTTATTTATTTATTTATTTATTTGA 
GACTGGGTCTCACTCTGTTGCCCAGGCCAGAATGCAATGGTGCAGTCATA
TCTCACTGTAACCTCGAATTCCTAGGCTCAAGCCATCCTCCTGCCTCAGC 
CTCCTAAGTAGCTATGACTACGGGTGTGCATCACCACATCTGGCTAATGG 
AATAAAATATTACAATGCCTAATCTTAATTTTCAAAATTTTAAATTACAT 
TGTACCTAATGCCCATGCATTTACTTTTTTCAGTGGGTCAATAGCCCTCA 
CTTTGGCAAAGGTCCQ^GGCCCAAGGTAAGGCCTTACTTTTTCQUVACTC 
ATCTTTTGAAAGACATAAGTGCCTGTAAGTTGTACCACATTAGGTTCTAG 
GAATTTTTCATCAAAGACTTTATCAGACTATTTTCCTCTAAGTTGAGAAA
GAGCTGGGGGCAGAATATGGCACTGAATGACTGAAGAGAAGGCACTGAAA 
TCAGGCCAGAGGTTGCTGGAAAGAGCAATGAGGAACACCAGCAGCAATGA 
GGAGCCGGTGATGATTTTGGCTTCACAGGGAGGTGTGTACCACACCGATT 
TTATCTCTACGTGGATGAACCACAGCTGTCGGCTCCCTTGTCTCCAGGAC
A7CACAC7CTCaCAT7CCCTCCCATCTTCCGGCTTCTGCTTCCCGGGGC 
CCTCATC7GCCCCATCCTGGG7GAACACrGG7CGGTCAACTGC7GGGCGT 
ACCTTCCCGCTCTGCACACCCTCCCTGGCCACCCCACCCACTCTCACGGC
TCGCACTGCAGAGGAGCCGCATCTCTAGCTCCAGCCCATCTGCCTCTTCT
GAGCTCTAACTTCATGTAGGCGACTCCTGCCGGTGTTGCCTCACAGGCCC
A7CA7ACr7CAAAGCATT7TCCCC7CAGAACACCATG7CC7GGCTGC7CC 
CTCCAGAAGATACATCTCTCAAGCACATCCCCGCGGCTCTCACCTGGATG
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ACTGCATTCACCTTCTC JVCATTTGCC~CCTTTGGATGTATATAGA. 

GTTTTAAAATACAAATCTGATGTGCTTGCTCTCCTGCTTGAAACACCTCA 

AAACTGCCTTCAGGATAAACCACTGCCCTTGACATGTTCACAGGTTGCCC 

ATGGCCTGGCCCTGCCCATCTCTTCAGCCTCATCTCATGCCCCTTGCCCC 

TCGCTCTCTGGGCTTCTGCCTCCCTAGCCCTCCTTTAGGTTCTCTAACAC 

ACCATAGTCCTTCTAGTGTTGGGGCCTCTGCAAGTGCTGTTCCCATTGCC 

TGAGACATGAATCCCTCTCCCTATCTCTACCTGCACCTTCATCTGATTAA 

T0CCTACCC7TCCTACTCATGATGTTGC7TTCTCAGGGACTCTCTCTGAC 

"TTTTAAACTAATCAGGGTCTCCCCAGTATATATCTTCATAGCACTCTGT 

ATTACTCCTTTCTTAATGACCACCTGCTGTAGACTGAATGTTTGTCTTCC 

TCCAAAATTCATATGTTAAAACCTAGCCCCAAATGTGATAATATTTGGAG 

GAAGGCTCTTTGGGAGGCAGAGCCCTCATGAATGGGATTAGTAGCCTTAT 

AAAAGAGACCCCTGAGGGCTCCCTTGTCCCCTCCACCGTGTAAGGATGCA

ACAAGAAAGTATGGTCTATGATCCAAAAAGCAGACCCTTGCCAGGTACCC

AATATGCTGGCACTTGAACTTCCCAGCCTCCAGAACTGTGAGAAATAAAT

TTCTATTTTTCATAAGCCACCGAGTCTATGGTATTTTGTTATAGGAGCAC 

AAACAGACTGATGTGCCACCCAACCATGATTATACGTGTAATTTATGGTT

TCTCTGCTAGTAGGGATGCACCATGGGGTTAGGAACCACGCTTTTCTTAT

TTCCCACACAGTCCTTAGCTCTAAGCATGTTCCTGAATCAAAGATCCCCA

TCTTTTATGAATGAAGGAGTCAGTGAATGAATTAATGAAAGAACTGATAA

CCCTCAATAATTATTCCAGCCTTTTATACCTACTATTAACAAGCTTGCAT 

^CTACTCCAAATTTATTGGGCTTTAACTCTATTrTTGGCCAGCCACATTT 

GACATTCCCTGAAGTAAATCTATGCTTTCCATCCTAAGTCAAGGAAGGAC 

CTGGACTAGTAGGGCCAAGAAAGGTCTAAATTCCATGGGTGGGAGAGAGA 

GACTAAATCTGAAAGGAAGAATAGATTGAGCAAAGGTGTAGAGATTGGGG 

AAGGCTGGACATTTGGAGAGAAGGAAAAGGAAACTGACACTAAACCAAAC 

&GTCTCACAAACACAATCTCATCCTTCCAAAACTCTGTGAAGTAAG AATT 

ACTATCCCAGGGCCAGGCACAGTGGCCCATGCCTGTAATCCCAGCACTTT 

GGGAGGCCAAGGTGGGTGGATCACCTGAAGTCAGGAGTTCAAGACCAACC 

TGATCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAATTAGCTGG 

GCATGGTGGTGCACACCTGTAATCCCAGCTACTTGGGAGGCTGAGGCAGG 

AGAATCATTTGAACCTGGGAGGTGGAGGTTGCAGTGAGCAGAGATCGTGC 

CACTGCACTCCAGCCTGGGTGACAGGGAGACTCCGTCTCAAAAAAAAAAA 

AACAAAAAAAAAACCAAAAAAAAAACAAAAAACAAGAATTACTATCCCAG 

TTTTGCAGATGAGGCAATGGAAGCTCTAAAAAGTTAAGTAGGAGAAACAA 

ACATGAAATGTATGTCTTATGCTTTTCCTCATCCTATTTCCTCAGCCTGG 




^CCC^^C^CATCCCTCTGTTTCCTTTCTGTTATAACACTTCTCTATTCT 

GCTGGCATCACAGTCATCTCCACCTGCCTTCCTCACAAGTTAAAAGCTTG 

TTAAGGGCAAGTGGTGTTCTTTGCCACCTCATTCCCCAGGGCTTCTAACA

CAGTGCCTa^TGCATGACAGAGTrGTAAAACAGGTTACCAAGCTGGCTTC 

AGGCAGGTTTGCATGGAACTGTGCTTTACAGGAATACCTGCTCCCCCCAG 

GCCCTGGGTCTTCCTCCTGAGTCCAGGCTCAGACTCTCTCATCCTGCTCG 

TTCTCTCTTGGGGAGCCACAGTAACTTTGAGCAACTTTGCATGGGATAGA 

ATGGCCTATTAGGGGOU3CACAAAGACCCCATGGAGGGAAGAGTACA6AA 

AGGGAAAACGATAATQITATTTTTTTAAGATGTGCATTTTCrTAACAAAA 

TGCTCTAGTACTTGTCCAGACTTTCAAACTCAAAAACCTAAGCGTCCTTT 

TCTTGAAGATCATCAAAGGCCCCAGTGGTCCTTCAGGTATGTCAAGCTTT 

CrAGAAAATAAAGCTAAGTCATAATCACTTAACACACATGGCTAAATGGC 

CATTrCCTTCTAATTTATCAGCAACTGTTACATATTTCTATACTAGAAAA 

AATrTATATTTATACTCAGGGTGGTAAGTTAAATTTGCCATCGAAGTAAA 

GCAGAAAGAGCGTAGCATGTATGTATATGTAACTCAACTGTGCATGAGAC 

AAAGATGTCTTGAGGAGAATGAGTCTAAGATGCGCCTGAGCAATAGTACC 
>Contig1

GCACCCATGTTTCTAAAGGGCATACCAGCCATAATAACAGGATGGGTGAG 
GATATAGACAGCAGATGACAGAGAGGAGAGTGAAAGCTGGGAATCCCAGC
TAAAGGCATCAGGTTTATGGAATGAGTAGGGGACAATACTGTGTGTGTTT 
ATACACACATGTAmTGTGTGTATATGTATACATGTTTATGTATATATAT 
AATTAi.ATGGTACCATTTCTAATTGACAAAATAATCTATCACATTTTACA 




GATGGATCATC3 
ACAGCTGCAGCAC 

CACCCGCTCTGACTGTCCCTAAGCTCCTGACATCTTCACCCCATGAAACT 

GCTGCTCCTGGGTGCTTCCTGCCTTGCCCTGCCCACCCTTGTACTGTTC- 

CACCATTGACACAGCTGGTGCCCGATGCAC 

>Contig2

NAAAACGAATCGTCACTATTGAAGCCTGTCTCTCANCGGATCGTGACTAA 




ASCCAAGCTGTGCTCTTACCAACTTGGGCACATGTGGTCAAGACCTCCTG 
ATGCTCTTGTCATGAGTGGGTGGGTGTTCTCAACCTTGGAAAAATAAACT 
TTCTAAATTAACTGW3ACCTGGGTCAGATTTTTGGGGTTCACAGCAACAA 




ATGCCACACTTCCAACATGTGTCCCCATCCACCATCTGTCTTwTTATTGC 
TGCATCCTACCCAGGCCCTGATCTCTGGACCCATTGTTGTATAATTAAGA 
ATTTGGGGCTGGGCATCGTGGCTGTGGCTCACTCCTGTGATCTCAACATT 

TTGGGAAGGTGTATrAGTCAGGATTCCTCCGAAGGATGCAACCCTAGGGA 

TCCTCTCTATGACCCTATGTCTA

>Contig3

CGCGCTCAACCGACCGATTTGCGCGAACCTGCCCATGCCCGAGGACAGTG 
TAATCCTAAAACGTCCCCTGAATCATAAGGATATGAGTGCGAAAGTACGG 
TTCCCTCTGTCACCACTTTCTAACAACGCTATGTCCGATCCGTGCACTAA 
CCCCGCCCAAGTCACTGAAACACTGATGGGCGCTTCCTCTACAGGTATCC 
AGGGCCAATACCACTACTCCCCTCCTCCCTGTCCCCCTTCCACTCTCTAG 
AGGCCGCGGATGCCATCCTCTATTAGCACAACCGAAAACGACGGTGAAAG 
TACCACGAAGCTCACGATCTGATCGGTCGCCCAATGCGGTTACAACGGCT 
GTCATCCCAACCCCCGTCCCATCCTCCATATTGCCCCCCCCTATGAGGAT 
GGCCCTATCATCATGACCTCCAAAATTCTGTCATCTCCCGACGTAATGCC 
3CCCCTCGAACGCCTGACACCATCAAGTCNGTCACCTCCCAAAATACTCC 

TCCTAATCACCAGGCCGAGTATCCCCGGTTCCACAATACCTCCTTGAGAC 

GGGCCGATATCACACAC 

>Contig4 

NGGAGTTTAGGTCAACTAGTAACAAGTGGGATTTGCGACTCAGGTCTATC 
TAATCCTCAAACCCACGTCCTGGACCCCTACACAGACTGCCCTCCCTCAG 
TCCTCTGTGTGGCCTCAAGAAGGGTCTGGACATTCAAGTTTAAAAATCCA
TCCAAAGAATCTATGGACCCAGTGGTCTCTGGAGTCAATGTTCTGAGGCT 
CAGAAGGGCCAGGCAGGAGGGAGCCGCCTCTACACAGTCCTGAGCAGAGT 
GGGCTGTGTCCCGGCACAGCAGGGGAGATCATAAACAGAATTCTGCCCTG 
GGCCCTATTTAAGTROGACCTTTAGGCTGCCGGTGTa^TGACCACAGGTC 
CCANGTCTGCACGATTGGCTGTGTGTGGAAAATCTTCACTCCTTGCGGCC 
TTGTCCTTGGCAGAGAGCACCGCTGCTTCCTCTGATGGCCACCAGGGGGA 
GGCGCTCCCCTGGGAACGGTTTGAANGGGAGCCTCACCCCACACGTGCCT 




GTGGC7GCCCCTT T C T 1 1 1 ' CT ' lT 
>Contig5

GGGAGCTAA CCGCTC aCTGGGATTACAGGTACGCACCACCACGCCTGGCT 
AATTTTGTATTTTTAGTAGAGACGGGGTTTCTCCGTGTTGGTAAGGCTGG 
TCTCGAACTCCCAACCTCAGTTGATCTGCCCGCCTCAGCCTCCCAAAGTG 
C7GGGA7AACAGG7GTGAGCTACCA7GCC7GGGC77AXA7G777C7AG7C 
CAAACATTTAGCTACCTTTTTTTTTTTTTTGAGACGAAGTCTCACTCTGT
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7GCCCAAGC7GGAGCACAG7GGCACAA7CG7GGC7CGCTGCAGCC7CAAC 

CTCCTCAGGCTCAGGTGATTCTCCCACCTCGGCCTCCCTAGTAGCTGGGA

CTACAGGTACGCACCACTACACCCTGCTAATTTTTTTGTTTTTGTATTTT 

TTGTACAGATGGGGTTTCTTCATGTTACCCANGCTGGTCTTGAACTCCTG

GGCTCAAGCAATCTGCCTACTTCAGCCTCCCAAAGTGCTAGGATTACAAG

CATAAGCCACCATACCCGGCCTACCTACTTTTAACTTGTGGAATTTTCTA

TAAGGTCANGGATGCCTGNGGGAACAAAAGTTTCTCCCTTGGTATATGCA 

AGTAAAATCCACATGCTGCCTCCC 

>Contig6

AGGACTGTAGCTGTTGTCTAGTCACCAGGCTGGACTGCTTGGCATGATCT 

CAGCTCACTACAACCTCCACCTCCTGGGTTCAAGGGATTCTCCTGCTTCA 

GCCTTCCAAGTAGCTGGGATTACAGGCATGCACTACCATGCCCGGCTAAT 

TTTGTATTCTTAGTAGAGACGGGGTTTCGCCATGTTGGCCAGGCTGCTCT 

CAAACTCCTGCCCTCAAGTGATCTGCCTGCCTCGGCCTCCCAAAGTGCTG 

GGATTACAGGCGTGAGCCCCCGGCCCACATGTAAAAGTTTATATCTCTGT 

TGTTTCACCTTGTTTTTGACCTAGTCTTTCAGTGATTTGAATCTTGATTC 

AGTCTrTTGTTATTTTAGTGGTACTTCCCAGCTTTGTGTCATCTGTGGAT 

GACATATGAGTCTTGCTTCTTCATGCCAATTTAAGAAGACTGAACGGGAA

TAGGTCAAAGGCATGGCCATGAGCGATTTCTCTCCAGCTTTTCATGGTGT 

TCAGCTTCAAATCTATTCACATATTGGACCTGCAAGCCATCATCTTATCC 

ACAGGCTATCATCATAGGTGAATGTAAATTGGGTTTAGGTGGCCAAGCTG 

AACGTGAGATATNTTC 

>Contig7

AGCATGTTCTCTAAAGGCCTATCAAAGCTGACATCAAAGGGATAAGTTCC

AGTTACCCAGCTGAAGGGAAGGAGGGTGTTTCAGATAGAGGAAGGATAAG 

CATGACCTATTCAAGGCCAGTGAAAGAAGCGTGCAACGGCCAAGTCAGGA 

GAACCTGAAATTGTGTCAAAGAGCTTGGATGCAAAGAGCCGTGGGAGACT 

ATTGGGGGTTTTAAGCAGGGATATAATATTCATTCAAGCATGCAGTAAAA 

GGTCACTGGCACCTGCCATGGGCCAGGACTCGGGCTCTACATGATTGCGT

CTGTTTTGGAAATATCACCCTGGCTGTGAGATGAAGAACAGGTAGGAGGG 

TCACAAAACTTGAAGCAGAGAGACTGTTGAGGAAGTAAGCTGTTTTTGTG 

TGGACTGTGGCAATCACAGAGGCAGAGGATATAAATGCACAGAGACACAA 

GGCATGTGGGAGGCAGAAGGAATCAAATACAATGAGTGATCAGATGTGGG 

GTTAGAATGGTGAGTGANAAAGACATACTCAAGGTGACACGCCCAGGTAT 

CTGGGTGGATGGTAAGACATTCATGGACTAGAATCGAAGAGGAGGTGGGG 

ATGGACATTCCTTCCGTTTAGAGGGGTTCACCAGGAGGATTTGCCGGAAC 

ATGGAGAGGATTAACCAGGAATCCGGTGCCTTTTTCCAAACTGGGTTGGA

GGTGAATGCTTTGGCACGCTGTGTAGATTTTAGGTGACGGGTGGTGACAA 
TGAGTCCGTGTCGAGCGCTGATTTTTTCGGCCTTTAGAGCGAGATTTATA
CAATAGAATTTGGCATGAGATTGGATTGCTTTTAGTCAGCCTCTTATAGC 
CTAAAGTCTTTGAGTGACTAGATGACATATCATGTAAGTTGCTGATAGGT 
TTCCAGTTTTCCGCTCCTAGGTCTGCATATTGTACTTTTCCTCTTACTCG 
ACTTAACCAGTACCAACCCAGCTTCTCAACGGATTTATACCATGGCACTT 
TAAAGCCAGCATCACTGACAATGAGCGGTGTGGTGTTACTCGGTAGAATG 
CTCGCAAGGTCGGCTAAAATTGGTCATGAGCTTTCTTTGAACATTGCTCT 
GAAAACGGGAACGCTTTCTCATAAAGAGTAACAGAACGACCGTGTAGTGC 
GAATGAAGCTCGCCATACCATAAGTCGTTTTTGCTCCCGAATATCAGACC 
AGTCAACAAGTGTCAATGGGCTCGTATTGCCCGAACAGATTAAGCTAGCA 
TGCCAACGGGATAAACGAGTCGCTCTTGGTGGAGGG 

GGTOTGGGGCGCCTGGTGTTTCTAAAGAGGATCTCCTGCCAGAAATGGTG 
TGCTGACACTGTTGTCCTCCTTGGTGTGGAACTTTGGTGGGAAGAAAGGT 
TGGAAAGGGAAATTTTGATCCTTGGATTTAACCCGAGTTTGTTACTGATG 




CAGGCCTCAGTGCACGTCAGCTGAGTGAACCAATGAGCAGGTGATGGGTC 
"AGGCAGAGCCCTGTCCTCTTTAGGCAAAAACCCTTGAAACACCGTTCCC 
ATCCTAGCC*"GTGTTCCACCCAAAGCTGGCCAGTCTCCAGGCCCTGCCTG 
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^S^^^°T32 7ATGGTS ^^^G^CCATTCCTGTCCAATG 
TGTGAGGAACTTCATTTCAGACTTGTTGGAAGCCCTGATGTTCAAAAACC

u x - CAAT Gv.CCCTTCCATTCCTCTTaVGGGAAATGAAAATTGTT^i 
GAAATCCTTTCTCTTTCCSAC^CCAACCAAAC^^CCGCG^Sc^ 

f'^^TXA^l^IICTACCCTGGTTCTCGGCCCTTACTGCGAAGGTG 

^^^^^^^^^^^ 

>Contig10

^^^i^J^^TM^CTATGGCGTACACAACGTCTCGCTCAT 




i 'jv.vartvjMAiuCTCTCTGCTGATGTCCCACGGAGCATGCCGTGAGACAACG 
CCACGAACGGCCCTCGGAGANANCTACTCTGCAATGAAGACGTACGATAC 

acacgtaggagtcctagctcaccagccgtatctac^ta^ctStact?- 

GGATACTCACTCGTGCATGCGGraaTaaaTrrETRnr!^*^^^^, 




uiiuLiL^nj 1 1» i GACCTTC7GGCGGTAGCG7NG7GGGCGC7A77AC 
TGTGCGCAGCAGGCGCNTCGTACATGTGTCGGGTAGCGATGCCAGGAGCT 
GTAACATAGCAAGTCGCCCCCCTACTCCTATCACTATCCCTACGCTGGAC 
CGCACTCGAGATCTGAACGCACGTCTTAACCTGCCAGTACTCGTGAGACC 
TATACTGCGCAAGCCTTGGCTAGGAGATCCTGCAGCGCCGGCAAAGAATC 
A S^7 A ^ A ^^^ GCGA ^ ATCSCA CACGCACCATAGAGTATGTGCAT 
ATTAACCTCTGAATGTGCTGCAAGCAGACGGTTGCTCAACATATATATGG 

ATGTGGGGAAATCGCCCTGGTCACCGCCACTTGGCGTCAGGAGGCACCAG 
CACGTCTGAGTGTCACGCACGTTACTC ^ G 
>Contig11

TTATGCTGTAATGGCACCGCTCACCCTGGGCTTATGAGCAGACCTAACCC 
TCCCANAGTGCTfWSa TT & r*& rmr>» w » ^ » ,-. 




x-wwwjujvnrtHft^, i U iTAtrrCGTCTCCTTT 

ATCATTCATGTCCATATTCTCCCATTTGCTAACATTTATGTTTCTGCTCC 
ACTGGATTCTTTGGATTTTTCTAGAACATACCCATGCTTTGCATTGCCTT 
GGTCTTTGAATATTTGGTCCACTTTTCCTGCAAAGTCCCCTCTCACCTTA 
TCTTCCTGGTAAACTTCCAGCCAACACCTCTTTACTAACCAGAGAAACAT
GGTTCAACTGTGCACAGGCTTGCACAGAAACTGTTCTCATATTGTCTTGT 
CATTGTCAATGTGGCAGAGATGCACCTTAGATACCTCTTTGAGAAAGGAC
TCACTGCCGAGCTGCCTGGCACGTGATGAGCTGATAGCTCCAGCTATAGA 
CTCCTTTAGGGTCAACCTCTGCTTTCCAGTTGAGATCATATCCTTTGCAG 




TGGGCAGTCAGAGACCTTAGCTAGTCTGCCTCCGAATCAGAAGGCTCTCT 
CTTGCCACTCTGGCC 

>Contig12



GCTGTGTCTAAAGATTCACGGCTGTAGTTCCAACTCCCGCCGCCCTCTAC 
TGTGTCCTCTTAATGGCAGTCATTCACCATCTTCCTGTCCCTCCCCTTCA 
TTTCTTGGATGGTGACTGTCACTTTGCTGCAACAGAACCCTGTCCCAATC 
CTTGATGGTTCAA^CACACATAGACATTCTTTTTAACAGGGCGGCCTCT 
CAGGTCTTTAATTTTCTTCCCTCCAATAACCTTGTGATGATCCCCCAGCT
TAGCCACTTACTGCCAGATCATTACCAGTAACrCCAGCCCCTCCTTAATT 
CTAGTTTCTAATATCCTAATCTGTGACCTCACATTCCAACTTCTTCATTC 
TTATCCCCTGAGTCAAAAAATCCTTTGATCCATGCAATCCATTAAGTCAT 
CTACCTTTTCACCATTCTTCGCCCCACTAGGGTTCTCATTCCTTTATTAC 
CCATATGAAATTCCAAGGCCTGTTGGAATCACTCCCTTGCAGCCACTGTC 
AATACTTCTGCCCCTTTTACTTCATCACCCTTATGTGGCAAAACCACAGC 
CCTGGTGGAGTCGATCCTTACCCCTGCTCTGTGCCAACAGCCGCACACGC 
ATGGCTGATGGAGGTTGGAAAAATCCACACATGCAGTGGGCCCTGTATGT 
CCATATACGTATCCAACCTCCAGCCTTGCATATGCCTCAGTGCTGCCTGA
CAACACATTATATGTTTTCCTTAGTTCCTTCAGTCTCCTGGGTGCCTAGG
TGAGTATCTCAGACATCCTTCTCTCTCTGCAAAGCTCCAACACCTCCACG
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TCACATTCAACTGATOACrGTGTCTCCTATGTCACTTAGATCACAGAGGC 

ATACATAAACAAATCCGAGCCACTGCCAGCACTCTGCACATCTGCGAGCA 

TGGCACCCCCAATCTAGGCCTTTCCTGCTGTCACTTGGGGTGAGCTGATT

ATACTCGATCCTAGTCATTTCTACTTATGCAC 

>Contig13

CTTAAGGCCTCCCTCTAACATTTTAATTTAAGATTGAAAAAGCAAAGATT 

ATT£TGTTTTGGCTGCGCCTATAGTAAAGTAACCCCTATGNCAAATTTTG 

ACACCTTATAGTATTTGACAGGGATAAGTATAAAATTGCTTGATTGATAC

ATCCACACCCAAATGTATGCTGGGAATGATTTTGTTTCACGGCACTCATT 

ACTTAATTTTTAAAACTCTTATTTAAATTTGCAATGTTTTAAATGACCAT 

CACTTAAAGTAGTAATCAACAGAGGTTAGGAGAACATAACAATACTCTTT 

CTCTTAGAAAATACAACAGAAATATAATTTTTTACAGTTTTGCTCCCAAA 

CTTTTCTCTGTAATAACATGCCTTACTCACCTTTACAATAGGTTTGTTGT 

GAGAATCTTGTAATGTAAACCCTGGGTGTTCTGTGAAGCATTTTTAAACT 

"CTAGTTTACACTGACTCTTATTCAAGTGTTTTTAAAAATATATTTAAAA 

AACTGGCCAGGTGCAGTGGCTCACACCTGTAATCCCAGCACT1TGGGAGG 

CCAAGGCGGGCAGATCACAAGGTCAGGAGTTTGAGACCAGCCTAGCCAAC 

ATAGTAAAACCTCGTCTCTACTAAAAATACAAAAATTAGCTGGGCGTGGT 

GGCGGGCGCCTGTAGTCCCAGCTACTCAGGAGGCTGAGGCAGAAGAATCG 

CTTGAACCCGGGAGGCAGAGGTTGTGGTGAACCAAGTTTGCGCCAATGCA

CTGCCAGCCTCTGCAGNGACAGCC 

>Contig14

GGGGGCGGGCCGAGTGATCCTAAAGCCCGCTCGCTTCACAACAAAGCC-A 

ACAGTCCAATCACTTAATGCTGCATTTATTCCTGGGGAAGCAAGTCTCCT

^TGCACrrTACACAGTGAGATAATCAGTTTCTCATGTGGACCACTGGGCC 

AGGAGGGCCTGACAAAGGGCAGTCTACATTTCAGACTGGAAACTGCTCCC 

AGAACTATTTCTTTCTAGTTCCCACCTCGGTCTGAGGTGCCTGAGGAGAG 

GGACTCAACAGAGGAAGCAGGAGCATAGCTCAAAGTGTCAGAACATGGAA 

GAGGAAAAGAATCCTCACAAGATTACGTAACTTACAGGCGTGTTGCTGCT

TCAGTAGAAGTTTCATCTCCCTCAATCCTGTACACTTTTCCATACATTAC 

ATACTCAAACTGGTCAGCCCTATGGAGCAATAGCAGCAAAGTTATTCTTA 

ACAGTAATTAACAATATAAAAGATCCCATTTAAAAATGGTTACTGGTCAG 

CCGGGCGTGGTNNNTCNANCCTNTAACCCCANCACTTTGGAAAGCATGCG 

GGCGATCCCAAGTCTGATATCGAAACATCTGCCTAACATGTGCAACCCCT 

CTCTACAAAATACAAAAAATATCCGGGCTTGTGTTGGCGCCGTTATCTCA 

CTACCCGGAGCTAAGTAAGAAATGCTTTACCTGGAAGCGATTTTTTTACT 

^ATATCCCCTCTCTTCACCGGGCGCGACCAAATTCTTTAGTATAGGAAAG 

TTTATTGTTTTATGCCTTTGTCAAGGCTCTACTGTATCTTTTCTGTCCAC 

rCAC 

GGTTCTGAACAACAGCAGGCGATTCCTAGCCCTGTACCCGGGGCATTGTC 
CAACACTCGACAGGGCTGAATTCGTCCATAACGGTGTGCCCCTCTGGGAT 
ATAGGATGAAATGAATTGATCTGAGTACCTGGGATGTAAAGTTACTAAAA 
CGCCAGCTAGGTTCACGCCCCGATGCTTAAATATGATCGTGGCCTACACC 
TCGTCCAGCAGAAAAAGTACCCTTTCTTCAACACCACCTCACGATCCTCC 
AATTTAGGAGCTATAAAACTCATGACTCTTTATTTACCCCCTGCAGATTC 
TCAATCCAATAGTGTGTGTCTCCCTGTGAACTCACGGATATACCGATTTT 
CCCCACGTCATTTCCACACGTCGCAATCGCTTAGTCATCCCTATGTATGA 
GAATCATGGATGACTATGTTGAAGTCCATCTATAAAGTTCAACCCCCATC 
TCCGTCCCTGATTCCCCCTCCCCAAGATCACCAACGCGACTCGACATATT 
GTTATCGCCQ^AGGGACCTCTTGCATCCCCCATATCCACTGGTCACCTCC 
CCTCTTGGCTGGAAGTCACCGGGAAGTTCTCCACATGTTGT 

TGCGAGCGATGTTCCTAAACTTTAGCGCCATTGACTCGAGCATGGTCATG 
GCTGTTTCCTG 

AGGGTGTTCCTAAAGGATACTACGTTCCCTAAAGTCCAGAGAAAAAAAAA 
AAAGTAACATAATGTGGCrTATTTGGTATAAAAATTTTACAGGAAGCATT 
GTCAAATATGAAATAGTGTTTGGTTTTGTTTGGGCTGTATTTGTATAAAT
ATGTTATTGGTATGTGTTCCAAAATTATAGGAAACTCCTATAATTCTGAT 
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;^A^^^™Sr TOGC ^ ( ^CAGACAATTGTCTTGCTTTGGT 

^^TTI^^^? mTATAAT CAGCTATAAAACTCTAACGGGTGCT 

^ttr^^^^ TAGCTOGG AGACTGTGACATCAGAATAGAG 

?t^™S^ A ^ TG ^ T 2CTGAAATATTCATGAATATCAAGC 

^^^^^AGATG^CTAAAAGAATGCTGAAGTAATC 

TTTT.GACiTTTTTTCTTAAAATGTTGATCCTTCGTTTTGTTTTTCAGAG 

TCAAGGAAATTTTTCTGTTGAGATATTGACAGCTTTTAACAATTAAGTAT 

ACTCCAGTGAACACAATTTGGAGCATATTTGTGTCTCTCTATATATATTT 

GGAAACAATNTTTGAGTATTCTTAACTTATTGCAATATT 
>Contig18

GGTTGTCTGCTATACCAGTAATGGGATTGCTGGGTCAAATGGTATTTCTG 
GTTCCAGATCCTTGAGGAATTGCCACACTGTCTTCCACAATGGTTGAACT 
AACTGACACTCCCACCAACAGTGTAAAAGCATTCCTATTTCTCCACATCC 
TCTCCAGCATCTGTTGTTTCCTGACTTTTTAATAATCGCCATTCTAACTG 
GCATGAGATGGTATCTCA^GTGGTTTCAATTTGCATTTCTCTAATGACC 
AGTGATGATGAGCTTTTTTTCATGTTTGTTGGCCACATAAATGTCTTCTT 
CTGAGATGTGTCTGTTCATATCTTTTGCCCACTTTTTGATGGGTTTTTTT 
TTCTTGCAAATTTGTTTAAATTCCTTGTAGATTCTGGATATTAGCCCTTT
GTCAGATGGATAGATTGAAAAAATTTTCTCCTATTCTGTAGGTTGCCTGT 
TCACTCTGACAATAGTTTCTTTTGCTGTGCAGAAGCTTTTCAGTTTAATT 
AGATCCCATTTGTCAATTGGCTTTTGTTGCAATTGCTTTTGGTGTTCTAA
TCATGAAGTCTTTGCTCATGCCTATGTCCTGAATGGTATTGCCTAGGTTT 
TCTTCTATGGTTTTTATGGTTTTAGGTCTTATGTTTAAATCCTTCTTTTT 

TTTTTTTTTTTTTTTTTGAGATGGAGTCTTAGTCTGTTGCCCAGGCTGGA 

GAGCGAGTGGCGTGTCTNTAGGACGC 

>Contig19

GCATGTTGTCTAAAGGTTTGTCTTCCTCCAAAATTCATATGTTAAAACCT 

AGCCCCAAATGTGATAATATTTGGAGGAAGGCTCTTTGGGAGGCAGAGCC 

CTCATGAATGGGATTAGTAGCCTTATAAAAGAGACCCCTGAGGGCTCCCT 

TGTCCCCTCCACCGTGTAAGGATGCAACAAGAAAGTATGGTCTATGATCC 

AAAAAGCAGACCCTTGCCAGGTACCCAATATGCTGGCACTTGAACTTCCC 
AGCCTCCAGAACTGTGAGAAATiai'rT-rr« ra-T-TT«r^r.j l T» » ~ 




CATGATTATACGTGTAATTTATGGTTTCTCTGCTAGTAGGGATGCACCAT

GGGGTTAGGAACCACGCTTTTCTTATTTCCCACACAGTCCTTAGCTCTAA

GCATGTTCCTGAATCAAAGATCCCCATCTTTTATGAATGAAGGAGTCAGT

GAATGAATTAATGAAAGAACTGATAACCCTCAATAATTATTCCAGCCTTT 
TATACCTACTATTAA 

>Contig20 

ACGGTTCTCTAAAGACTTTCAAGAGCTGGATTTTATGCTTTAGGTGAAGG
TGATAAAGTAAAGTGCTTTCACTGTGGAGGGGGGCTAACTGATTGGAAGC 
CCAGCGAAGACCCTTGGGAACAACATGATAAATGGCATCCAGGGTGTAAA 
TATCTGTTAGAACAGAAGACACGAAAATATATAAACAATATTCATTTATC 
CCATTCACTTGAGGAGTGTCTGGTAAGAACTGCTGAAAAAACGCCATCAC 




AATTCAGACATCTG<3GAGCAACTGTAAATG^CTTGAGGTTCTGATTGCAG 
ATCCAGTGAAGGCTCAGAAAGACAGTACACAAGACGAATCAAGTCAGACT 
TCATTGCAGAAAGAGATTAGTACTGAAGAGCAGCTAAGACACCTGCAAGA 
GGAGAAGCTTTGCAAAATCTGTATGGATAGAAATATTGCTGTCGTTTTTA 
TTCCTTGTGGACATCCAGTCACTCGTAAACAATGTGCTGAAGTGGTTGAC 
AAATGTCTCAAGTGGTACGCAGTCATTACTTTCAAGCAAAAAAATTTTAT 
GTCTTAATCTAACGCTATAGTAGGCATATTATGTTCGTATTATCCTGATT 
GAATGTGTGATGTGAACTGACTTTAAGTAATCAGGATTGAATTCCATTAG 
CATTTGGTACCAAGTAGGAAAAAAAAATGTAAAGCCAGTGCTTAGACACA 

>Contig21 

CGCTGTCTTAAGAACTGGGCTAGGAGTGAGCAGTGAGCCAAGATCGCACC 



FIG. 4 (5 of 61)



WO 99/06426 




^rTATT"GTTTCTAAACAC5AGGATAAGGGGCAGAAAAAATGTTruAA^ 
IatCATGATTTTTAAATTTCCAACTGAGATAGGAATAGCACTGGGTAGTC 
a^G^AGGCTGGAAAGACCCAAACAGCAGTTAAAACAGGAACTAGGCAAA. 
QAAACCAAAGGATAACAGTAAACCTAAACTAAGGGAGAGAAAACTGACAA 

AAGCTGACTTAGGATAACTGAC 

CC^GAATATAAGCCGCAAGTAACCAATTAAATTTGTrTTCCAAAATTGTA 
CCTCiAAi ftvwuu^ __-„ «.^-rn» r r aTa.r.rTft,TAACTTCCAGAAG 




GCTA^ATATAAGCTATGATAAAACAGTTGGCCCTCTGTATCATGGGTTTC 
ACAACTGTGGATTCAACTAACrGTGGATGAAAAATACTTGGGA^ 
aMY^CTGCATCTGTACTGCACAAGTGCGTGCTTTTATTCTCGTQ^TTAT 
^SSSSc^TATAA^ATTrATATAGCATTTACGCTGTAT 

TA^OTA^TATAAGTAATCTAGAGATGATTTGAAGTATACAG 
CCT^GATTCTGGTATCCATGGCAGTCCTGGAGTCAATTCTCCTGCAACA 

Sc^atttgttcag^ttctcttctatatcatgtttatatcagw 

ASTAAGAnTTTrAATGTGTrCATATAGGTTTTGTGTATTTTTGGTTGT 

?S?kctagatatatg»gtatttattgctattatgagtagtgtt^ 

^OTAAGTrACCTTGTTTCTAACAACCTTGCTGAACTCTTAWAG^CTCA 
^^AATTAATCTTTCTTA^^ 

^^^HlCi^ii^A^Tr^rr^CTATAACATTAAATAGTAATAAGA 

3GAGAATAAATTTAAATTTCC1 
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TCAGGTTTTAT GCTT GATATAGATTTGTGATATATAGCCTTTCA 

AAAAAAAAATGCTTTCCTAGTAGTCCTAATTTTTTAAAAAAATQ^TGVTA 

AATAGATGTTGAACATTATCAAATGCTTTTTCTGCATCTATAGAGATAAT

CATATGGT'r j rril'A CTATTTATTAATGTAATGAATTAGACCAAT^ 

ATGCCAACTCTTTCTTGTATTTGTAGGGTAAATCCTATGGGATCATAAAA

TACTTTTAATACATTGTTAGATTTGAAGAGTTAACGCCTTATTTAGAACG 

TTTTCAGTCACATCCATAAGTGAAATGGCACTATAGTGTCTATTACTATT 

ATATTTTT CTGG TTCTGAAACCAAAATTATACTCACCTQVTACAGTAAGT 

TGGGCAACTTTTGTTCTTTTTTTCTGAAAO^TTTGTGTATAGAAGAAAT 

TAACTGTTCCTTGAAAGTTTGATAATAATCATCCAGAAAATTATCCCCAT

CTAGGGCTTTTACAAAAAGGAGACTCTAGAATGCCATTTCGGTTTCCTTG 

ATGTGTATTGGCCTCTTTCATTTAGGCTTTTGGATTTTTTAGGGCATTTT 

TTCACTATAGGCTTTTTACCGG 

>Contig24 

CATAAACTTCAGGTTGGATGTTCGGTCAAAGTGGTCCGGCGATGCGAAAA

CGAGAGGGCTCGAGGACTGGGCAGAGAACTATTTGAAGGTAXCTCTCAGG 

GGAAACCAAGCGGAAGGCGGGGAGTAAAATTGGGAGGGAGCGACGGCCTT 

CAAAGAAGGGGCTTGCATTAGATCGGCGAGATCCGGGAGGGTCTGGTGGG 

GAGAAATGACTAGAGGACAAATCTAATGGAGAGACAGACGGAGATAGATA 

TCGTGACAGAGAGAGGGACAGTGACAGCGCACAACAGTGCAGGGTCCATG 

AGTACAAGGCCCTTAAGTGTACACCCCAGCCGGAGTCATGGCAATTCGAT 

TCCTGTACTGACCACCCAGGATTTGGGTAGACTGTACGAGTTAATGAGCA 

TGGTCCCCAACAAGACTGCTTCGACCTCAGATGCAAAGCACACTTCAGGG

GTCCCCAAGCC^CTCATGTTTTTTGAATGACTGCCATAAGTTCAAAAATT 

CCCACAATTCTCTCAGATTCAATAACTGGGTATAACCACTCATAGAACTC 

AAGAAAATGCTATCATTATTATTACAATTTTATTATAAAGGATACAAATC 

AGAAGGACTAGCCAAATGAGGAGACACATAGAGAGAGGACTAGTAAAAAA 

CAGAGCTTCTGCGTCCTACCTTCAAGGAATCAGGATGCACCACCCTCCCA 

GCACATCAAGTGCTCATCAACCAGGAAGTTCCTCTGAGCTCCAATGTCCA

GAGATTTTAGGGAGGATTCATTACATAGGTATCATTGATTAAATCATTGG 

CCATGTACTTGAACTCAATCTCCAGTGTCCCTCTTCTCCCTAGAGGTCTG 

AAGGG TItjGCT AATATCATGTGGCTCAAAGCCCCAACTCTAATTACCTTT 

TTGGTCTTTTCAGGGACTAGACCCCATCCTGAAGCTATCTACAGGCCCTG 

CCATGAGTTAGCTCATTAACATAACZAAAGACACTTATATTACTCAGAAAA 

TTCCAACAGTTTTAGAAGCTCCATGTCAGGAACCTGGGACATAGATCAAA 

TTL1 1 11 1 1 i 1 1 1 IT 1 lT'mTGGAGACAGGGTCTTGCTGTGTTGCCCAG 

GCTAGAGTGCAACGACAGATCACAGCTCAATGCAGCTTCAACTTCCCAGG 

CTTAAGTGACCTTTCCACCTTAACCTTCCAAGTATCTGGGACCACAGAAA 

ATGGCTAATrATCCTGGCTGA lTm ' A AA C TTTTTTTT' r TTGTAGGGATG 

GGATCGCCCTGTGTTGCCAAGGTTGGTCTCAAACTCCTGGGTTCAAGCAA

TCATTCTGCCCTGGCCTCTGTGATGGTTAATACTGAGTGTCAACTTGATT 

GGATTGAAGGATACAAAATAATAriTn'GGGTGTGTCTGTGAAGGTTTCG 

CCAAAAGACATTACTTTGJIGTCAGTGGACGGGGAAATCCCCCCTTCCCGV 

TGGGACGGGGAGACCCCCCTCCATCCAGGTAAAAAAATCTAATCACCTGC 

AATGTGGCAGAAATAAAGGAGGGAAAAAACGGGGACCCCTANATGGGTTA 

TTCTCCACCTAATTCTTCCCCCAGG 

>Contig25 

CCATGTATTTCATTTCTACAGACCCTGAGATGAATTTGTGV.TTGCCACGG 
GGTCCTGAAGTTCAAATACTCTATTTGGTATCCTGCCCCTGTGGTTAACT 
GTGATCATTTCACTCACCTTGTTTATGATGAGAGGTGCCACCATCTGGCC 
TCCTCCACTCTGCAATCCTGTTAATTCCTATCAAAGCTGAAAACCTGCTG 
CAGCACCtZACACCATCACCTCCAGCCTAGAGAGGGAAGCTACCAGTGAGC 
TCTCCTGGATGCCGGTGTGCCCCTCGCCAATACATTTCTTCTTAGTCCCT 
TGGTCATCCTGAGGTGTGTGATTAATGGACAGCTATGTGGATTGCACATA 
ATAGATGTACTCCAGCATCTTCATCCCTGATTTTCCTTTACAGAAATCAC 
TCAACCTTAGCAACATGTGAAAATCACCTAAGGACATTCTTTAAATCCCT 
CTGTCCACATGGCAACACAAACCACTTAAATAAGAATCTCCAGGGAGTCA 
CTCAAGCATCAATGlTriTrAAAGCTCCAATTTTAAGGATC^TTACATTA 
TGTCGAAGAAATTATAGTATTTCAGCCTTACTGACTGTAAACCACCACCA
TATCTAAGCATCCATTAGTCAACCTAGCAGACAATAAACTAACATTACCT 
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~CAGGTACTCAAATCAATTCATTGCATCCCAAATCCCAGATGGGCCCACC 
^^TATTGACAAATTCAGCCCAATCTTGGTTGAACACATTTAGAATATATT 
-CCATGAACAATATCCGGTTGACGAGTTTCTTTAACTTTTTGGAGTTTAA- 
GCCATTTCCTTTCACAGTAGCCTTGTTAATTCCCTGTCAATGCTCCATGG 
GGGTCATGAAGAGACCTCTTATTAACTGTGAAGCAACTTGGCTCAGGTGC 
AGACACTCAAATGCTTCACATGCAGTGGGAAAAGAGAGTGATTGTCTAC 

>Contig26

^TTAAAAAGAACTGAGTCTTTATTCAGTCGATTCTTCTAATCTATGAACA 

^AGCATCrCTCTCAAAGCATTTAGTCCTTCTTTAATTTCTGTCATTAATT 

TTTTAAAATTTTCATCCTAAAGATTCTGTATATGTTTTGTTGAATTTATG 

^TTAAGCATTTCACTTTCTTGGTAACAATTATAAATGATTTTGTGTTTTT 

^AT T CCACTAGTTCATTTTCAGTGTGTAGAAAAGCAATGAATTTTTGTGT 

GTTGATCTTTGTTCCAACATCTTGCAACATTATTGAACTCATTTATTAGT 

T CTAGGAGGTTTTTrCATTTTTCTTGTAGATACCTTGAGATTTTCTATAT 

AGACAGTCATGTTGTCTGCAAACAGGCACAGTTTTATTTCTTCCTTTTCA 

ATCTATATGCC Tll l'T l 'TTT TT TTTTGCCTTATTGCAGTGGgrAGAACTT 

CTAGCACTATGTCAAATAGCATTGGTGAAAGCAGACATCCTTGTTCCTTG 

TC'TAGAGGAACATTTGGTCTTTAATCTTGATTTAAAAAATTCCTTGCAC 

TAAGTTACCGTGTTTTGCGGGAGGGAGAGGTGGGGTGAGGTGGGGATTTC

CCCTAATGTTTACAAGCTGGGATTTTCTTTTTCCTGTGTCTAATTATTTT 

"CTCATTGGCTTGAAAAATCTGATAAAACATTTTAGGACTGTGTATAAAA 

""AGAATTAGCCAAGTGCAATGTCTTTATTCAGAAGAAATTTCATGGACGT 

-GTGCC^ACTCTCTTGGCTTCCTGGCTTCATGGCTTTCCAGATCCCACAG 

-AAGCTCTGGATAGTAGAAGTTATAGTAAGACTGACrrCTAAATAAATGA 

AGTGACT-TAACCTTACTGATATGGCTTAAAGAAAAGGAGTGGCCTTTAA 

GATCCATGAACTTCTCAAACAAAAGTGATAACGTTATCTCCATGCATATA 

TAATACTAAATATAATGCAACTGAGAGAAGTAGGCTGTGGTAAGAAAGGA 

GACCCAAGTGCCATCTGAAGGCAGCACTTACCACTCTGCTTCATCCCACC 

GAGGAAACAAAGCATGAGTATTGCCAGATTTTCTTCTGTTTCAAGAAAAG 

CCAGAAATCCAGGTTTTTGCGTGAAATGTCCTGATTTTAATGTTGGGAAC 

TAATTTATATTTTGAAATAACATTGTGTGGGACAAGTGAACTTGTATGTG 

GAACTGCTTTCTCCCAGTGGCGACCAGTTTGGACCGTTGATACTCAGCAA 

GTTCAGCCAAGTGCGCCTTGTCATTGTCAGTCATCAAGGTGATGTGTGAT 

TGGTCAAGCAATTAATTTTGCTCAGCATCTCGTGTGTTTTCAAAAGAACT 

GAAGGTTCATTTGC 

-ScAGAGCACAATGCGTATTCATAGTATATTGACTTAATTTCTAAGTGT 

AAGTGAATTAATCATCTGAATTTTTTATTTTCAGATAGGCTTAACAAATA 

GAACATTCTGTATATAAATGTGTAAATTAGAGTTAATCTTTCCAATCACA 

-AATTCGTTTTATGTGAAAAAGGAATGAACTGTTCCATGCTGGTGGAAAG 

ATAGAGATTATTTrTAGAGGTTrGTCGTTGTGTITTGGGATTCTGTTTTC 

TTTTAAAATTGTAAATATGTACTTGTGTGAATGATTTTTTAAAATGATTT 

^ACCATTTTTGGAAGGGTATTTAATGATAGAATATCATCGAGCCAACATG 

CACTGACATAGAAAGATGTCAAAGATATATTAAGTGTAAAATGCAMAGG 

GAAAACACTATGTACAGTCTGAGCCAAATCAAAGCATGTATGTTTTTTAT 

ATGTGTACAACAAAAGGTTTGGAAAGATATGCGCCGAATTGTTAAATGTG 

GTTTCACTTGAGGGGGTGGGAGGATGGGGCCCCAGAGGGGTTTTTATGGG 

3GCC ^CACTTGGTATTTTTTTCATTTTGTTCTGTTTGAAATTTTGTTT 



^GGCCTCCCAAAGTGCTGGGATTTCAGGTGTGAGCC^CtAUUUU^^-v. 
CTGTT^AAATTTTTTATAAGTATGTACTACTTTTGTAATCAGAATTATTA 
A^GCATTTTACTGATTTAAAAGCTTAGACATGTTCAAATGCCTGCAAA 
ACTACTTAACACTCAGCTTTAGTTTTTCTAATCCAAAAAGGCCGGGCAGT 
-AAT^^GGTGCCAATGTGAAATTTAAACGGTTTTATGTTmCCTG 
-GTTGTGAATGAAAAATATTTCTGAGTGGTGGTTTTTTGACAGGTAGACC 
ATGTCTTGTCTTGTTTCAAAATAAGTATTTCTGATTTTGTAAAATGAAAT 
ATACAATATGTCACAGATCTTCCAATTAAGTAGTAAGGGTTTATCCTTAA

TCCTTGCTAATTTAAGCTTGCATAAGTCACTTTACTAAAAGATCTTTGTT 

AAGCTAGTATTTTAAACATCTGTCAGCTTATGTAGGTAAAAGTAGAAGCA 

TGTTTGTACACTGTTGTAGTTATAGTGACAGCTTTCCATGTTGAGGTTCT 

CATATCACCTTGTATCTTGAAGTTTCATGTGAGTTTTTACCATTAGGATG 

ATTAAGATGTATATAGGACAAAATATTAAGTCTTTCCTTTACCTAAGTT^ 

GGrTTCTTGACTAGTAATAGTAGTAGATATTTCTGTAATAAATGTTCTC^ 

CAAGATCCTTAAAATCTCTTGGAAATTATAAAATTA7TGGAAAGACAAGA 

ACAGTTTTTATTCATTATATGCATTATTATCG 

>Contig28 

CTTTCTCAAGAAAAGGGAACTGGAGCAATTAAACATATGTAATTTTTTTT 

TAAAAAACCCTAAACCTAAACATCTACCTATATACAAAAATTAATTAACA 

ATGGATCATGGACTCCAATGTAAAACATGAAACTCTAAACTTCTAGAAAA 

AAAACTGGAGAAAACCTTTGGTACCTATGACAAGGCACAGTTTTTAGACT 

TAACACTAGAAGTGTGAACTATACAAGAAAAAATTAATAATTTGAACCTT 

ATGAAAATCAAATTATTTGCTCTCCAAAAGACCCTGTTAAGftGGATGAAA 

ACTAAATTACAGATTGAGAGAAAATATTTGTAAATCACATATTTGACAAT 

GGACTTGTATCTAAAATATCTAAAGAACTCTCAAAACTCAACATTAAAAA 

AAATATCTAATTAGAAAATGAGTGAACATTTTACGAAAGGGGCCTTATAG 

ATTAGCAAATAAAACACTTGAAAAGATACTCAGCATCACTAGCCATTAGA 

AAAATGCATATTAAAACCACAATAATGTATCGCTACACACATATAAGAAT 

GGTTTATGAAAAAATAGTGATGACACCAACTGTTAGTGAAGATGTGGAGA 

AACACTCATACATTGCTGGTAGAAATGTAAAATGGCATAGCCACTGTGGA 

AAATTATTTGGCAGTTCCTTTTAAAACTAAAAATCAATCTACCACACAAC 

CCAGCAATTTCATTACAGGGCATATATCCCAGAGAAATGAAGATTTATGA 

TCACACAAAAATCTGTACACAAATGTTTTATGG7CACTTTATTCATAATA 

GCCAAAACCTGGAAACTATCCAAATGTCCTTCAATGGGCAAAGGATTAAA 

CACACTGTGATACATCCATACCATGGAATACTACTCAGCAATAATAAGGA 

AAGAATTACTGCTACACACAAGTTGGATTAAACTCAAGGAAATTGTGCTG 

AGTGAAAAATTAACAAGCCAATCTCAAAGGACACATAGTTGATGATTCCA 

TTTGTATAACATTAATTAACACAATTAATTACAGAGATGGAGAACAGAAT 

AGTGGTTGCCAGGGATTATACATGGTGGACGCGGTGAGGCGGGCCTCCAC 

GCCTT GGAGATGAAGGGGGCTACACCCrrTTAAAGCACACCCACGAGAGAG 

TTTTGTGCGGAGGGGCCCAATTTAAGTACTCCGCCCCGGGGGGGGAACAC 

AGGGGCAAACAAAAAAAATTGGCCTTGGGGGTGACCAAACACACAAAAAA 

AAAACAAACACACAAAAAAACAACNATGGGTGGGAGGATTAATCGCCAAA 

TCTGAGTAAGCTATCTGGACAGTACCAATATCGATTTCCCAGTTTTGATG 

TTGTACTATAATAATGCAAGATGTTAACATTGGAAGAAGCTGGCTGAAGG 

GGGCTCAGGAACTCTCTGGACATTTCTTTGTACCTTCCTGTGAATCCATC 

ATTATTACAAAATAGGACATTTTCTAAAGGTTAAATCATTrTAATTTTAA 

AATGTCCCTGTTACTGTTGAAACTCACATCTCCATATACTGATCAAGAAC 

AGCACTAATGGCCCCTGGCCTCCAGGAATTCACAATTCCTACTGACTTTT 

CTTTGAAACCTTGGCCAAGTCGCTTCTCTTCTCTGGTCCTCAATTTTTa^ 

TCTTCAAAATGAAGATTGAATGACTATTAAAATCTCTTGCAATTCTTGAG 

ATGAAGGGTCCTAAAGGAACTGAAGAGGATGCCATGTAATGTAAATATGG 

GTTTTTACTCCATCAGCCAGCCAAGACAGAGGGCAGACACCAAGACATGG 

TAACCAAGGAGGCCATGTGTAAACAAAGACCATTTAGACTTATGCTCTGG 

CCTTTGCAGCCCAACTGGTGTGGCCAGTTGGTGGGGTATGAAGAAAATGG 

GGCCTTCCAGGAACCATGTTGAGTGGAGATAAGCAGGGAGGAATGCAGAA 

GACATGGGGGCAGTGCCAGTCTCAGCCCGAGCCAGCTACACCCACACATG 

GTTATGAAAGACTGACAGCCTGTAAGNTGAACACAGCCCTGCCTCTCTTA 

GATAGGC 

>Contig29 

GCAAATATGATCTCAGATGTGGATTTACTGTAAAGTTCATCAAATTTAAA 
TTTCAGAACACTTAATCTGCAAGAGTCCTTTCCAAGACCCTATACCTAAT 
TTTGTGTTTACAATTTTATATTTGTTTT CTTAAAGAAGAC CAC CAATATA 
AACTATATCCAGCCTTCATGATAAGTACATAGGAAACTATGCAAATAAGG 
GGGAAAAAAAACAAAGAAAAATACCTAGTTTACTAATGGTTCACTTCTGA 
ATAGCACATATTCATAATGATACAAGCACTCATTACTAGTCTAGGAAAAT 
GAAGATATAATTGCATTAGGAAGATCAAGAGGTAGGAAATGTGGATGTGT 
G7GGTATAGACTAGGGCAGGACAAAGAACCTAAATCCTCATTTTCTAAAG

ATAATTGTTAATACGTAAAACTCAAAATTCAAGAAGTAACAGTAAAAGCG 

GTCATTAAGAAACAAGCACTAAACACCAGATAGGA AGCGA GAGATGGGGG 

AAGAGGGCGACAATCTGATTATTTTTTGCAACAAATTTTGTAAAACCATT 

TGACTGTTTACATGTAGAACTTGGATCTTTTTTAAAAAACACAAAATAAT 

AATACTATTATTTTTTAACTGGATTTTTGAAAAAGAAGATAAAAGTCTCA 

TTTAGTAATTAAAACTCATTCCAGGTTAGTCCACTCAAAACTTATATTC 

GAAAATTAAAACTTTGGGAGGCTGAGGCAGGCAGATCACCTGAGGTTGGG 

AGTTCGAGACCAGCCTGACCAACACGGAGAAACCCCGTCTCTACTAAAAA 

TACAAAATTAGCTGGGCGTTGTGCATGCCTGTAATCCCAGCTACTCGGGA 

GGCTGAGGCAGGAGAATTGCTTGAACCCGGGAGGCAGAGGTTGCAGTGAG 

CCGAGATCACACCATTGCACTCCAGCCTGGGCAACAAGAGTGAAACTCCA 

7CTCAAAAAAAAAAAAAAAAAAAAATTAAAACCTCTGGAAGTTGAGTTTG 

CAAATATTCATTATGCTCATTTTTAACTTGTATGTTTGGAAAATGTCATG 

ATGAAAATTGAGGTTGGGGGATGAGAAAAAAAGAAAAACATCAACCCCAC 

AGCCCATTCAATTTTCAGCCCGACCCACAGCTCCGGGGAAGG6CAGCAGG 

'CCATCCTTCACTCTTTCTTCACCTCTTTCCCCTCCTTCTGGCTCTTCCA 

CCTCTAATTTGGAGCCCAAAAAAAGGCACTGGGAAATGGAAAAGTCTTTT 

GTACGTGGTACTTGCCGGGGAAGCTGCCATGAAAACCTGGCCCCACGGTG 

GGGAGGGAATGCCCANCTGAGGCCTCGTGCCCATGCTAGGATAGACTCGT 

CCAAACATGTCAGGTGGTCTGACAGGGCAAGCANCANGAAATCATGTATG 

AGTATGAACTGATCTGTATGCAAGGGCGGGGAGAACACGCGGAGGAATGG 

GGCGTGAGAAAACAGCACAGTACGTTTCTTTAGCAGCTGTCTCTGCTCAG 

CCATGGGAGGTCACAGAGAAAGAGGCTTGGAGGCGTTATTTTCAC7GTGA 

GATGTGAGTGTAAAAAAGTGCCCAAGACACAGTGAGTACCAGGGAGATGC 

C CTTTT C CT AC C CGAATGCAGAATGGC CACAGGC CTTAAAACACACACA 

TGGGTCCTCAGAGGAGAGAGGCCTCCACAGTGGACACCCGCATTCTCCCC 

TGGTCAGCAGCAGCAGGGCGAGTGCTGGGCCATCATGAAGCTT CACAGGC 

AATGAGCTCTCAGCAATAACAGGAACAGTGCCTGGGGGACTGTAGCTGCA 

AGACCGATTTTCATGTAAGATGGCCTCTGAGGACTCCGAGATACACCAGG 

CTGAGACTAGCTGGCAGCTCCAAGTTGTTGGTCAGAAGAGAACAGGAACT 

AGGGAAATTGGAATTACTGTTACTACAATTCCTTTACATCCGCACAACCA 

TGAGGTCCAGCGATTTTCTATTA T1 * L"1"1"11"1 1 TT A AGACAGGGTCTCAGT 

ATGTCGCCCAGCATAGAGTGCATTGATGTGATCATGGTTCAGTACAGTAT 

TCACGTCCCAGGCTCAAGTGACCCTCCTGCCTCAGCC7CTCAAGTGGCTG 

GGACAG CAGTTGCATGCTAC CAGGCCAGG CTT7TTTTTTTTTTTTTTTTA 

GTTTCTGTAGAGCACATAGC 

Agt^AACAATGGCACAGGGAAACAAA 

ATCTATCATAAGATGTTAGGTATGGGGGCTCTGCCGGACACAAACTCAAG 

GCTrTATGCTGTTATCTCTTGAGCGAAATCCTGGGAACTTCGTACATTGC 

TTGCTTCAGTACCTTATCAGTTAATCGGACTCTTTGATATGTTGGGAGTC 

AGCGTACACAAGTTAACTCCTTGAGGAAGGGGGTGGGTAAGGAGTCCTTG 

ATGTCTGGTAAATGAAGGAGCGAAATCGAGTTCCTCTGGCTTTCTCAGCT 

AAGGGAGAGCTTATTCATGTS3AAACAAGGCTAAGTGATTAAGGGAGAAA 

GGGAGAGTCTGAAAACAAGGTTAGGTATTACAATGTCAATAAAATTGGTC 

TCCTrATACAGTCCTATCGTAGATTTCTTTCCATCTTTAATCTCCCTCTA 

GCACCACCAGACTTTTTCTCTCTGTACCTTGAGATGTAAATTTTGCTATC 

TGAATTTTCGTCTAAGAGTTGTTTCCTTTAATATGCAAATTTAGGGTTAT 

TTAGCTGACAACTGCCAAAGTAGTGAAACAAGTTATCAAGAACTTGAACG 

T CTAAGGTAGGAAAAAAAAAAGTCTTTATGAATCTATAAGATGTACTTCT 

ATTGGCATGCCTAATACGTCTATGTATTTACGTGTTGTGTACACAGTTTT 

TCACTACTGAAAATATATAGAGGAGTTCTAATTAATTGACTTAAGACAAT 

AAAAGCGCTrGAATCAAATACCTTATCAGGAAAMGGAAAAGACAAGTCA 

AATGCTTGTTCAAGTTTATATAACTTAAGTAAAATCTTTAATAAATAAGC 

TAGCTTTAACATTATTTGAAATGTCTTAAGAATTGCCAGCAGGTTCTGGG 

TTACAGAACTAGTGGGGGTGCAGTGGGGTGAGGGTTGGTGGGGTGGGGGG 

TGGTACGGGGGCTTrGTTTTTTCTTGCTGCCCCCTTCTGGGTTGGGGAAG 

-GGCAGGACC7TGGCAGCACCCCGAGCCGGCATGGCGTTAATAATGGAGG 

GATGCCAGACCCAAGTGGCTAAGGCCCGGCTGCAGAGCCAAGTTGGCATT 
TCCAGACTGGGGCTCGGGCCGCACCCTCTCCAGGACCCTCCCCTTGTACC

GAGCAGATTGTCGCGGGCAGTTTGGGCCAGCTGTCC7GGCGTGGAATTTC 

CCAAATTCAACAAATCCTCCAAGAAATCAATCCATCCATTCATCCATCCA 

TCCATCCATCCATCCATGCATCCATCCATCCGTGGCAGATTATGAAGCAT 

GGA7CA7TAC7777GGGA7G7GGA7A7A77CAG77AACAAGGAGCAGC77 

TCAAGAGC7GGA7777A7GC777GGG7GAAG777AGAAACAC7AGC7CCC 
AG 

>Contig31

ACC7CA7G7GC7C7AGCGCC7C77ACC7CA7GCCC7CCAC7C7CAG7C77 
GCACTCACCCTGCCACACTCAAGGGCTTCCCCAGGTTCCTTCTTAGATTC 
CACCGATAGCTCAGGGACTTTGCACATGCTACGGTCTCTGCCTGGCTCCT 
CCCCAGATCTTCTCATGCCTAGCTGCTTCTCATCAGCACCCCTCAGAGAC 
TGTCCCTGCCCCACCTCTCCAGGTTCCATACCTGCCACCCTCCCCCAATC 
ACGTAACAGTTTCTTCACAGAGCGAGTTACCATCCCAGTATTTCCCTAAC 
T7A777777G7GAC7GG7C7G77GCC7G7C7CCACCACAAGAACA7AAGC 
TGCA7G7GAACAGGAGCC77G7C7A7C77G7CACCCCAG7G0C7G7GACA 
TAACCTGATACACATTAGATGCTCAATGATGTTTGATGAATGAAGTGCTG 
GTAGTCCAACTGTGTTTCCT7GTCTGTGTAAGTATGTCTGTTGTGGTTTC 
CTAAGAACCTACAGCTCTCCCACTGTGACTCCTGTTCTATGGTCCTGATT 
TGCTGGACTAGAATCCTAACCTACATGCTTACTCTTAGTGTCCTCCCCCA 
GAGGCTGAATCCCAGTCCCTAAACCTCCACCAAATGGCTAAGACCtAGCT 
TCCAACCAGACAGGCCTACGCTGAGACCTCAGCACCGCCCTTCTGCGGTC 
TCATCCTTAACGCATCCTTCAGGGCCCAGCTTAAATGTCTCTTCrCCAAG 
GAAGGCTATCC7CTTTCTGCCCCTCAGTGCTCTCCATGCCTCCTCTATGC 
CTCCATGCCTGC7T7CCAACCCTGCAGAGGTGGAGAAGTTGCTAATCTGC 
TG7G77GACA7G7GC7GGGG7GCC77GGGCCAGGGAGCAGGC7GG7GG7G 
7GC7GA7AGCCCG7GGC7G7GCCCAGG7CCA7GC7CAC77CC7GAGCCCC 
AG7GGAG7AGGC7CCC77TCCC77A77GCAGCAC7CAGAGGAAGGACGTG 
CT7C77AGGACAGA7C7GGCCAACC7C7CCC7CG7GAGAGAAGGCCCAGC 
CA7CC7C77GCCC7C777C777C7CC7GCCCCCGAG7AA7AAAGG7GCC7 
GG7 CAGAGC CT7C7AGAAGGAGAC CCAAACA7 C CACCACACA77CCCAG7 
7CCAACCG7CA7CCACATGGC7GGC7G7GCAGG7AAACGCAGAGTC7G77 
7CACACACCCAACCA7CTAGTA77GGA7GGGAGGACAG7AGCG7GACAC7 
C77C7CCAGCC77GAGCCC7AC7G7GGGCCCCACCCAACCCAGA7ACCAG 
AGGAGCCC7G7AC7GGGA7GC7A77GGATGC77G7CCAG7CA7G7ACAAA 
G77AGCCC7TTG77ATA7AGAG77AGC7ACG7ACA7C77CC7C7G7AGGG 
AACCCAAGAGGGGAGAAGAGA7A7G7AG7AGGA777AACC7GCAAATCC7 
C7GC7GAGCACCG7GCAC7ACA7ACAG7GGG7AGCA7GTGG7AGG7GC7C 
AA7AAC7A77GACCGA7C7A77GAA7ACACG7AAGA7CG7GACAC7A7C7 

AAAACGNGGGG7G7GGGGGAAAAACCCCCCCC77G777AGGAAACCCAAA 

77GGACCG7G77GGC 

>Contig32

GCGCGA77G7GC7AAAGA7CA7GCA7GCC7GA7CAAACG7CCCCA7A7GG 
CG7C7CAGAG7CAAC7CCTTCCCCA7CAG7GCCC7GAC77CGGCA7AACA 
AACC7GGCAGGT7AAG7G ATTAA TCGG7CCTG7ACAAC7G7AGCCC7TAG 
CAGGAAGCAC7AAGC77CG777TCA777A777C77CCC7GGAAC7GCAAG 
AAATGAGGGATGCC77CCGCCA7GAAG7777GC7GA77G7CCAC777G77 
C7CAAGGAGA7A77CACAGTT7T7AA777G7C777C7C7CC7GCATGG7C 
7CCAAACC7G7CCAAAGAAGCCAGC7GGC7CCATCATC7G7AAAATCACC 
A77G7CACCAGAGCACTTGACrrCC7G7TGCCC7ACAA7CCACCTGCACT 
T7AT7TCC7GCCACCA7GA7AA7G7AG7G77AC7ACA7777ACA77CAGC 
7G7AAGAAA7G77ACA7TCAT77AC77AAA7CAAA77AAG7C7GC7CAC7 
CAG7CCCCCACAG7GACCAAC77ATAAAAGAGAAGGTACA7T7aVG7CA7 
CAC7GAGG77C7C7CTACCACTGGAAAAC7GAGGAAGGG7CTGGAG7CCA 
CAG7GGT7AACA7CA77GCC7C7G777777C7CC7AC7CAA7G7AACCA7 
CCAAGG77AC7CACAAA77CACAAAAAGAGG7C77CACC7C7GC7C7CAA 
GACCCAGAGGGC7GGG77C7AAAC7CAAAGGCCAA7G77CCCCAAC7777 
TGCA77G777CAACA77GGGGAAAAC7CGAGGGGATTCAAGAA7GG77A7 
A7AAG7777G7GGAAAAATG7A7AA777777AAAA77AAAA7ACAAAG7A 
T7A7GGAAAGCAC7AAA7A77GAA777A7A7AAA7A77CCAAA7A77T77 
-^AAATTTTTAGTGAGAAACTTGAGCTTGCTTCTGTGAGATATTTATTTT
AAAACAGATTTGACACTTAAAATGTCTAATCAAGCCTTTTAAACCATGAT 
^TATCTCTTCAAATTCTTCAGATGCCACCATCAATAAAGAAAC TTTGTTC 
ACACAAGTAAGTGG7AGCAAATGGCAGGGTGTTTATCATTTTTTTTTTTT 
^X^r-TTTTGAGACGGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGTGG 
CGCGATCTCAGCTCACTGCAAGTTCCACCTGCTGGGTTCACGCCCTTCTC 
^TSCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCACCTGCCACCACGCC 
-^^^TAATTTTTTGTATTTrTAGTAGAGACGGGGTTTCACCGTGTTAGCC 
T'nriTrTrnaTrTrr^fTa.rrTr^TGaTrrGrrrc^rTCGGCCTCCCA 




^^TTTTGCCTGATGAAATTTTCCTTGCCACTACTCTGGATGGTTTGATAC 
AmAAATTGTGCTTCCAGGGTACAATTATCCTTTAAATCTATACCTCTT 
^CCTTTCTTTTATTGACAAATATA^TGTTACACTTTTCTGTCATTGCAGC 
C ACAC CACCAGTACACAGATCCCAACAGAGTTGTAAT ATTTTATTAGTTT 
CAGAGTTTCAATATTTTATCACTTTCAATACTTCATGTGCAGGAGTTTTA 
^TTGGTACTTCTTTACAAAATAAATGATGTGCTTCCAAGCATtTCTTTTC 
AATAATTCCAATCAATGTTATTAACTGAGTAATACTAGTATCTGTTTATT 
-ATAAATTCACAGGAAATGCTTTTTTACTTATTAGTCTTTGGAATTCTGT 
TGTTTGTATAAACATCTTTCATGATGGCTTTGTGTCTACCAATAGCACTA 
TTGCCAAAAGGCACCTTTTTCTTGTTCCTTTACTTCACTGGTCCGAAGCC 




TTCATGAAAGAAATGTGTTTCTTATTTTGTACTTGCAGGCA.CTTTTTAAA 
—■""•GTAATCTTTATTCATACTTTAAAATTAAAACAGAGTAATAGAACCC 
ATAGAAGGAAATCAATACCCACGAGTCCATACTGATATAAATAAATAGTT 
« » R^TrirtftrsnrinriaiaT^&ricrrrrrCCTTACAGAAAAATTT 



CAATTAATAAATGAAGAAGGAATTAGGGAAATAl^uv-^i iau^-aa i<««v. 
AACCACAGTAATAATCATTACAGGCAATATCCAAAAATAAATTCCAAAGC 
CAGTGGGCAAAAGTTTGAGGAGATACAGGATATTAACATAGTCTCCAAAT 
AGCTCATGCTATTTATAAATTACAAAAGGAAACATAACAACTGTATAGTG 
AAGAAACTCAGCAGACACCACCTTAGCCAAGTGATCAAGGTTAACGTCAC 
"\GTAATAGGGCTTGTTGACATACTGGACTCCAATCTGATACACTGATAA 
/■.» w»r-T^r^rir'a<w&TTrTTacCAAAAACAGAATTCTAATGTAA 




TTAAGGAAAATGTCAGACAAACCTATTCT<jAUAAA(-ai iv.iA*«~ww« 
CTAACCAATACTTTCAAAATTGTCAAGGTCATAAAGACCAGGCGATGGTC 
„ mw^snriiir'ir'rairtna.ftiT&aLJVCAACTAAATACACAAATGGAA 




^CAATATTAATTTCCTAGATTTGATCATTATAUiAi'j^i i/wt«i. i. j. * 
CATTAGAGGAATCTGGGAGAATGGTATATATGAACTCCACTGTTCATTCA 

ACTTTTTCAGTAACTATTATTTCAAAATAAAGTT 

GGGAGCGGCGGCCCACGCTGATCTCTAAAGCTTTAGACCACATTGGCTCG 
AGCATGGTCATGGCCGTTTCCTG 

GACGTCTTAGCGCTATATTATaAAGAAATATTCACCTCCCTGCTGAGCTT 
ACAGGGTGTACCTAATGTCCAACAATATGAAATCTCTTCAATGAATTGCA 
GCACGTCCATATATAACCCACATGGAAGCTGTCCTCTTTCCTCACCTTCG 
aACTTCC»TGCCAAAGAGGGACCTCTTGGACTCAAATACATCTTAGCaA 

tatagaagatgctggagacttgtaggagaagtggagagggtttacagtgt 
Igcc^Sgaaaacaac^ 

CCATGCCTCAGTCTAGTCAGGAAACCACrAGATCCTGGATGGCrTCTTCT 
CCCTTCCCCTCCTTTCTCTTCTCCTCTCCCTCCCTTGCTCCTCCTTCCTC 
CATCACCCACTCCTTACTTCCAACCAAAACTTGACTAGCTCCAGTCTCAT 
CCCTCCTTATTGAAAACTATTTTACTCAGCCCTCCTCCCCCACTCCTGCC 
CAATCTrTATTCCTTACCTACATCAGACTTCACQ^AAACAAAGGCCAGGA 

TAATAAACAG<»C»AA^^ 




CTGAAaAGAGCATGAGGTTTTCTGCATATCATTACACATTCAATAGAACG 
TCATGCAGCTGTTAAAAaaGATCTGTAG* ^GGCTATCTTGTGACAGAAAw-

GCATTGGAGATATACTGTTAGTGACAAAAATAGGTTATAAATGAATTTTT 

CCATGCATGCCTCTATATTTATAAATACACACACATAAAAGACAGGAAGG 

ACAGACATTAAACATTCATAGTGCTTAAGATGATGCATAGTATAATAGTT 

AGGACCATGGCCTTTGGGACAGAAAACTACAGCCTCTCTCCCACTTATCA 

GCCATGGGACCTTGGGCAATTTGCTCAGCCTCAAAGCCCCTGTTCCTTTA 

TC?GTGTGCTGGGGTTGTTGTAAGAGTTAAGTGCAATACACAGAGAGAGA 

GAGAGTACCTAACATGTATTATGTGCTCAGTCAATATGCATCATAGTACT 

CATTGTTACATATGTTCCTAAGTGCTTTATACGTTTTTTCCCTAAGTTGA 

C CAT CTGTTT TTGGCATTATGAAACAT AATGATCCTAACZAAATTAAAATT 

AAAAACATAAAGAATATTTGCCCCAAAAAAATAAAGAACATGAATTCTTC 

AAGTAGCCAAGGGGCO^TAGACAGAAGTAAGCCCTTGGTGGGGCTTAGTT 

GAGAGAAGTCTCCAGAAGGTCTTTCGTGTGTTAAAGAAGAGGGTAACAGG 

GAGGAGGTGGGGAGAGATGTTAACTGAGTCTAAATGAGCACCTGGAAGAA 

GAGATGGGACAGGCCACTTCTGCCTGGACTCCCTGATTGTTAAGAAGAAT 

GAAAAAGAGCAGAAGTCTTCCCTGAGCCCAACTTCACTCCCTGACTTAAC 

CTAGTCTTTGCCCCTTCCCTCTCACTCATGGCTACTTTCTGTGGTCACCT 

TGT TGTA GAAATGGATGTGCAGCCACCTCATCTTTTTCTACCTCCTTCAC 

ATGTTTTAGATAATTTAATGTAGTAGAAGACGGTTACAGCAAAAAATTAC 

AAAAATCAAAATATCTCTGCTATCTACTGTTGCATTTCTAACCATCCCAA 

AACAGTAGCTGAAAACAGCACTCGTGGTCGAGCGCGGTGACTCATGCCTT 

TAATTCCAGATACTCCGGAGGCTGAGGCAAGAGAATCACTTGAACCCGGA 

AGGTGGAGGTTGCAGTGACTCAAGATCATGCCACTGCACTCCAGCCTGGG 

TGACACAGTGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAGCACTCGTG 

TATTTTGTTCAAGATCTGTGGTTTGGGCAGGGCAGGGCTCAATGAGGACA 

TCTCGTCTCCGTTCCCGCAGTGTCAGGAAGTGTAACTGAGACTGGAGGGT 

CACACAGAAGATGGCTCCCTCAAGTGGCCAGCAAATTGGTGCTTACAATT 

GACAGGGAGCTGTTGACCAAGGGCCCCAATTCCTCTTCCTATGGCCCCTT 

CTCGGGCTGCATGGGCTTCTTTACAGAATGGCAGCTGGATTCCAAGAGCA 

AGTATCACAACCTACAGAAGAGTGGAGGAATATTGAAAGTTCACAGTCTC 

TTAAGACGTTGGCCCAGAAACTGGCAAAAGCTTCATTTCTGCCATGTTCT 

ATTGATCAGTCACAGAACCTGCACCAATTCAAGAGGAGAACATATAGAGG 

A CATCT CTCAATGGGATAAGTGTCAACAAATTTGCATCTATCACAATCTG 

TCTTTTGGGTACAAACTATTTCTATTCCTCCATTATGCAAAATATACTCA 

CAACCTCCCAGGGGTCGCAAAAGCCTCATCCATTTATGGCAAATGTGGCC 

CTTTTAATTTATATAAAATAATTTGCGGGGGCTTCCTTTATATTTTTAAC 

TCCCCTGC 

>Contig35

G7GCAGAGAAGTGATTTAAAGCCCTTCAGAAAGAATGCTTTATTCCCGTG 
GAATTTGGTAACTTGCTTGGGTGTGGGGAGGTTTGTCAGCTTTCTCCACT 
CAAATTATCAGACCCTTTCCATTTAGTGGTAGACCATTTCCCTCGTCCAG 
GCCAAGGGCACATAGTACAGAGAAATAGGGAGTTGTTACCCAGGGAGAGA 
ACTTGGCTCTAAACCTGTAATAGAAAGGTCAGTTCTGGTCTGGAGGGTCA 
ATTTTGATCTTTGGCTCAGATCCAGGAATTGGAACCAAGGCTTTTGAACA 
TTTTAATGCAGGGGATTAAAAAAATGATACGAGTCATTCACGAATATATT 
TGCTTAACATCTAAAGAGATCCCTCAAAACACTAGAAAAAATAAGAACAA 
AAATCTAATAAAACAAAATTTGTTAAACACATTTACgWUVTTTTTTT rr T 
TGGTAAAAATTCAAATGTCATAAATAAAGCTAAAGTTCCTCTTGATGACT 
CGCTCCTCTGCCCTATTCCACTCCAAGTAACCACTATTATCAGTCTTGCC 
AATACCCTTCCAGACCTCTCTACCTCTATATACCATTAGAAGCACATGGT 
TTTGCATTGAGGATGTGCAGT G TTTTGTTTTACGTAAATGTTATCACTCT 
GTTCTTGTTCCATAATTTGCCTTTTTCTCTCAATGATTTGCTTGGCTATC 
TTTCTATTTCAGTAGCATCTCCTTTCTTTTTAACTTACCATTGTTTATTT 
AAC CTTGCCTCTATCAACAGATATGTAGGTTGTTTCTAGTTGATTTCATT 
AAGTATTTATAAACAACGCATCAGTAGATGTCCATAAATTTCTTTACGGA 
AGATGGCAAGTAGTGGAATTGCTGAGCCAAAGAACATGTTTAAAAAACCC 
AAAAAAACTAGACGCTACCAATTTTCTCTCCAAAATGGCCATACCCACTT 
ACCCATACAGAGATGATTTGGAATCTGGCTTCCTCACAAGGTGAGATGCC 
T7CACAGTTTCATTCTTCCTGGCATGTCTTCC C TTTTGTATCTGAGAGAG 
CTGGCAGAATTGTGTCACTAAATCAAGGATAGAGGGTCAAATGACAGCTC 
AAGCTCACAGGCACCTCI'GCTTTCTTCLwAGACCACCTGCTTTCCTGCLk
CCAGCTCTGTTCCATCTTATAGAATGGTTGCCACTTGGGTGTCTGCTCCG 
ACAGCCATGTCATCCTTTGCACTGCAGTTATGAAGCAGACAGAGCTAGGA 
GAGGGGCTTTGCCAGCCTCTGCCCTAGCTTGGAGAACTTCAAAAAAGGAG 




TATTTAGAAATTGTTGGCTTGTCGTGCCGAAAGTATGTGT GGTT ACGGGG 

AGTACGGAAATTTCGAGGGGTGGGGGCGAGGCCGTGTGTCCTTTAGCCCG 

GGGTTTTCCCGTCGCATGTTTAAGGGGGGGGAAGAGGGGGGATGTTTTCT 

TTCCGCGAAGGTTTTTGAAGAACGGCGTGG 

>Contig36 

CCCCCCACCGCCACTACTCAACCGGCCGTTCACGAAACAACTCGCCACAT 
CCACTAACCCGCTGGCTCACCACCCACCGCCCTCCCGATCCCCCCAATCC 




CAGCCCCAACCTACCACCAACCCCGACT^UCUUH^cuiw^^^w^w^wwv 
AACCCAAATGCCCACAAAACCAGTGTCCAAACCCTCCTTCCCATCAGTTT 
GGTGGGCCCATCACCGCTTCCCCTGGCCCAGGCTCTCCTTTTGTGCGCTT 
GGAGCAGCAGACTGATCTCCCAGCCTTCACTCACTTCATGTGGTAATCTG 
—--^-ruTri rTriTrEna ATr*TTrTGCATCCCCTCACTACTCTGCTGA 




ACGTAGAAGGCCCAGCACAATTTGCCCCTATGCCACCTACCTCTCTAATC 

^TTTCTCCTTACTCTGACAGACTCTCCGTCTGTCATTTATGTATTCTTTT 

A.TTGCTCTCTT CTACTTTTAGTATGAACTGGATTTATGGATTTTTTTAAC 

ATTGCT"TCAAGTATGGAATAAAGAATTTTATTTATTTATTTATTTATTT 

ATTTGAGACTGGGTCTCACTCTGTTGCCCAGGCCAGAATGCAATGGTGCA 

GTCATATCTCACTGTAACCTCGAATTCCTAGGCTCAAGCCATCCTCCTGC 

CTCAGCCTCCTAAGTAGCTATGACTACGGGTGTGCATCACCACATCTGGC 

TAATGGAATAAAATATTACAATGCCTAATCTTAATTTTCAAAATTTTAAA 

TTACATTGTACCTAATGCCCATGCATTTACTTTTTTCAGTGGGTCAATAG 

CCCTCACTTTGGCAAAGGTCCCAGGCCCAAGGTAAGGCCTTACTTTTTCC 

AAACTCATCTTTTGAAAGACATAAGTGCCTGTAAGTTGTACCACATTAGG 

TTCTAGGAATTTTTQVTCAAAGACTTTATCAGACTATTTTCCTCTAAGTT 

GAGAAAGAGCTGGGGGCAGAATATGGCACTGAATGACTGAAGAGAAGGCA 

CTGAAATCAGGCCAGAGGTTGCTGGAAAGAGCAATGAGGAACACCAGCAG 

CAATGAGGAGCCGGTGATGATTTTGGCTTCACAGGGAGGTGTGTACCACA 




GGGCGTACCTTCCCGCT^ i(jUAUACi-v_ 1 u*_u j. j^^^— ----- 
CACGGCTCGCACTGCAGAGGAGCCGCATCTCTAGCTCCAGCCCATCTGCC 
TCTTCTGAGCTCTAACTTCATGTAGGCGACTCCTGCCGGTGTTGCCTCAC 
. ~* r^Ti-fn k »nr"»TTTTrrrrTPaGZJlLCACCATGTCCTGGC 




TATAGATTGTTTTAAAATACAAATCTGATW.«j»- 1 A ^- iV - ■ LUWi «^l*^? 
ACACCTCAAAACTGCCTTCAGGATAAACCACTGCCCTTGACATGTTCACA 

ggttgcccatggcctggccctgcccatctcttcagcctcatctcatgccc 

CTTGCC CCTCGCTCTCTGGGCTTCTGCCTCCCTAGCCCTCCTTTAGGTTC 

AACACACCATAGTCCTTCTAGTGTTGGGGCCTCTGCAAGTGCTGTTC 

uTrrr-rrrrrrTRTrTrrACCTGCACCTTCAT 




C^GATTAATCCCTACCCTTCtTAt-iUAiUKi^i. j.uv.i. * — - 

TCTCTGACTTTTTAAACTAATCAGGGTCTCCCCAGTATATATCTTCATAG 

CACTCTGTATTACTCCTTTCTTAATGACCACCTGCTGTAGACAGAATGTT 

TGTCTTCCTCCAAAATCATATGTAAAACCTTCCACCAGAGCGATGATTAG 

AGAAGCCTCCC 

GACTGACACTCAGAAGATATTAATAAGAGCACTAATGATGC^TTGCAA 
CATGTCTTTACTGACTTCCAGAAGCTTCTTACAGTAAACATGAAATCAC 

ATAAT^CTrCCACTTTCCTACTGTTTCTTGTTCTGGGCTCTGTCCTGCT 
AC^GTCTAATATCTTGGCCCCTTAAAAGTTGCTAATCTTCCAAACCTCA 
77CC7G7GACTGGGCCtawTGGTCC77G'i. . CA7GGGCC77GAAGA7AC1 w*
CTGTACACTTATCTGGAGCATCCAGTGCCTACCACCTGACCCAGATTCC^ 
CATTGCGCTCCTCCCTCCTCCACCTAATGGGATTTGCTCATACCCGTG7G 
GGACCCCTCCCATTTTCCCCAACTGAATACTTATCAAGACAACGCATTGC 
CA7AC7CCC7CG7ACCC7GC7C7GGGCA7CAGAC7GAA7G777G777CCA 
TTGAGGATCTGCAGCTGCATCAG7TTCCCCAGCACCGTCCAACCCCTTGA 
GC^TGGCTAGTCCTAAAGCAGAGAATTAGCCTTTCTATCCCTGCTGCTAT 
ACATGC7GGGACAAATAATAAGAAATGACAGCATTTTATGATAATGCAGG 
C7GCAGGAGGCAGGAGGCAGGAA7CAAA7TCG7GC77A7CAAA7AG7GC7 
CCAA77C777GAA7A77GGACTA7AGAA7A7G7CA7GGA7C7A7GC7CAG 
G7GGG77CCC7A77AC7CAC7CCAC7GAGGCCAGG77G7GGGA77AGC7G 
7CCAAGAGGGAG777CAGTC7CACAGCA7AGGG7CA77C7GAGAA77AC7 
GGCCCACAC77G7G7GGAGACC7CCAGAGAACAGAA7C7GGG77GG7GCC 
A7G7AC77CCAGGAGGAGAGAAG7GGCAGGA7GCCCAGCCCCACAA7CAG 
AGGGGAAGGGGCAGAGCCACA7G7A7GAAGA7CC7CTCCCCAG7ACG7GC 
CAA7CACAGGGCTTCCTAGC7TT7GGGCCAAGGAAACAA7GTGGGAAGCA 
AAAAAGGACAA7T77CTCCTCCC777GCA7GAAGAC7GAGCAG7TTTACC 
AGA77CCCAGGGAAACACCC77CCAC7C7GGG77GAA7GTGAG7GAGAGA 
CA77CAGC7GGAACAC7AGAAAAAC7A777CC7GAGCCAC7CACC777AG 
CCC7AGAAAG7G77GGA7T7G7CC77CA7C777GCCACAG7AGAGAC7GC 
7GA7AGCA7CAGAACT7GGGC7C7GGAA77AGACAGATA7GGG7ACAAA7 
C7GAGC7C7C7CAC7TATTAG7G7GGGA7G7AGAGCAAC7T7TAAAATCC 
77CCAAACC7CAGAC77C7CA7GCA7GA7G7GAGGA77G7AA7AGGGCCC 
ACC7AA7AGGGG777T7GAGAA77AAAAAAG77A77CAA7GAACAGCA77 
7AGCAAGA7GCCTGACCA7TGAGAAAATAACAAA77G777A77A77A77G 
77A77A77AAACA7CITrCC7GCACC77C7GACT 

CAGAAA7AC77AGGATGGGA7GGA77CC7GCA7GGGC7GAG7CAAGGGTG 

CAA7AA7GGAGGAG7GAAGAAGGAAGAAA7GGAGGCAGAAA7CCCCAGGA 

GCCCAGCA7GG7ACAAGGCTGAGC7AG7GC7GCAGAGCCTCC77GGAACA 

GCCACAGAGC77GCA7C7GGCCC7GGAGGAACC7C77C7AGC7GGCAGGA 

CCAGCCACAACAG7GGCCAGGGGATT7CCCAGGGCGTGGGC7CCTCAGGA 

GTTCA777GGACCAAGCCTGCC7GGAGAGGGGT7ATAACAGGGA7CCT7C 

CC7AC7GGCAGG7GA777ACCCC7CGG7GAGAAGC7CAGGCA777G777G 

A7GGAAGG7GGAAGGCCCTGTGCTGGGCCAGTGACTATCAGGGA7GGGCG 

GG7GGC7GGAAAATAGCAAATAAGACAATA7GATAACACAGTTAACCACC 

ACAC7A7G7GAAGCTACAATA7GGG7A7C7G7AA7AGACAAT7CCAA7G7 

AGAGAA7AAT777AAGG7GTCA77C7CCCCGCCAA7GCCATAAGCACACG 

GCC7C7GCC7GGG7T7CTCAC7G7GGAA7G7CCTCC7GG7C7CC7CA7GC 

CCAGAGAG7GGGAAGTAC7CC7AC777AACACCGGC777CC7G7CAT77C 

CN7GCAGCCC7CC7CAGCCCCC7C7GCACAGGGAGG7T7CC7CCC7GCTG 

C7GCAG7GC777G7AC77GTTAG7GG7ACC7GCACACAGG7A77GG7G7C 

CTTGTC7CACCACCCTACATCACTGTAAGCTCCCCAGGAGCAGGCTTCC7 

G777GAC7CACCTG7GATCC7CCACC7CCCACCCTGTAGTGCCTCAAGCA 

77C7G7AGAGCACA7GGACGCC 

>Contig38

GAC7AA7AAGTAC77CATTA7TTGGG7ATTITCCAAGAACAACA7ATTGT 
A GGAAA CCAT7CTTTCTAAAAAAAAAAGTGTCCTTTTAAAAAGG7GAA7A 
ATTTTTG7C7AAT7CAAAG777A77GAAAAG77A7G7A7AAAACAAGG7A 
AAAGGAACAAGGAAATAAGGGAAA7G7AAAGAAAAT7A7AGAAA7AAAG7 
GG7A7T777TGGTAAGAAAGCT7AAAGAGAAATAA77T7AGGTAAGAAAG 
AA7C77ACCTAAAA7TT7G7GCTAGAATAAAG7GAC7GGC7AAGAAAGGG 
A7G77 CAAAGC7A777A7GACAAACCCACAGCCAA7A7CA7AC7GAA7GG 
GCAAAAGC7GGAAACA7TCCCTTTGAGAACTGGCACAAGACAAGGATGTC 
C7C7C7CACCAC7CCTA7TCAACA7AG7ATCGGAAG7TCTGGCCAGGGCA 
ATCAAGCAAGAGAAAGAAATAAAGGG7A7TCAAATAGGAAGAGAGGAAG7 
CAAA7777C7CCG7TTGCAGATGCA7GA7TGCATATTTAGAAAACCCCA7 
CA777CAGCCCCAAAAC7CC77AAGC7GA7AAGCAACT7CAGCAAAG7C7 
CAGGA7ACAAAATCAA7G7GCAAAAA7CACAGGCATTCCTA7ACACCAA7 
AA7AGAC7AACAGAGAGCCAAA7CA7GAG7GAAC7CCCA77CACAA77GC 
7ACAAAGAGAA7AAAA7ACC7GGGAA7ACAAC77ACAATGGACA7GAAAG 
^CCTTTTCAGGGTGAAw .GCAAACCAC* -.CTCAAGGAAATAAGAGAGia^A

ACAAACAAATGGAAAAACATTCCATGCTTATGGATAGGAAGAATCAATAT 

CGTGAAAATGGCCATACTGCCCAAGTAATTTATAGATTCAATGCTATCCC 

CATCAAGCTACCATTGACTTTCTTCACAGAATTAGAAAAAACTAATAGCC 

AAGAC AATC CTAAGCAAAAAGAACAAAGCTGGAGGCATTG 7GCTACCTGA 

C-TCAAACTATACTACAAGGCTGCAGTAACCAAAACAGCATGGTACTGGT 

\CCAAAACAGATATATAGACCAAAAGAACAGAACAGAGGCCTCAGATATA 

\C^CCACACATCTACAACCATCTGATCTTTGACAAACCTAACAAAAATAA 

GCAATGGGGAAAATAATTCCCTATTTAATAAATGATGTTGGGAAAACTGG 

^^AGCCATATGCTGAAAACTGAAACTGGACCCCTTCCTTACAACTTATAC 

^AAAATCAACTCAAGATGGATTAAAGATTTAAACATGGCTGGGCATGGTG 

GCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGATGGGTGGATCAT 

GAGGTCAGGAGATGGAGACCATCCTGACTAACACAGTGAAACCCTGTCTC 




GGAGGTTGCAGTGAGCCAAGATCACGCCACTGCACTCTAGCCTGGGCAAC 

AGAGTGAGACTCCATCTCAATAAATAAATAAATATGGAACTCTCCCAACA 

Q^TAATAAGACAAACCCCCAAATGTTTTAAATGGGCAAAAATATTTGAA 

CAGACACTTCACAAAAGAGGATATGTAAATGGTCAAAAAGCACATGAAAA 

GATGTTCAACACCATTGGTCATCAGGGCAAAGAAAACTAGAACCACAATG 

AGATGCCTCTGTACACCACTTAAATGTCCAAATTAAAGAAAACAAGTTTT 

GGCAAAGTTGTGGAGCAACTGAAATGCTCGTGTATTGCTGGTAGAAAAAC 

AAAATGGCATAACCATCGCAGATAATTTGTTGTCAGTTTCTTACAAAGTT 

AAACATATACTTATTGATATGACAGTTCCATTCCAAGAGAAATGAAAACA 

"rAAGTCCACACAAAGACTTGTACCTGGGTGTTCATGGTAGCTCTATTCAT 

AATTGCCAAAATCTGGAAACAAATCAAATGTCCATCAGCAATGGAATGGA 

^ATACAAATTGTGGTACACATGTACAATAGAAAACTACTCTGCAATGGAG 

AGAAATTAACCATTGACAAACACAAAAACATGGACAAACCTCAAAAACAT 




CATTCATATGAAA 1 UAv-AUAAAWVJVa 1 wvj J. J. vmv»j>j * «wiw>— _ _ _ - . 

GTAGATCTGCAGTTGCCTGGGGATGGGGTGGGAGGTTGACTGCTCTGACG 
CGTAAGGAAATTTGGGGGTAGGTGGGGGATGGTGGGAATATTTTTTGAAT 
TGAATTGGGTAATAGTTTTAATAGGTAAAATATTGGACCCCACAGTATTT 
GAGATAGGTTTCAGTCAATTTAGACAGTTTATTTTGCCAAGGTTAAGGAT 
GCATCCGTGACCCAGCCTCAGGAGGTCCTGACAACCTGTGCTGAAGGCAG 
'CAACATACAGCTTGCTTTTATTCATCTTAGGGAGACATAATACATCAAT 
Z AATGCATGTAAGGTTTACATTGGTTCAATCTGGAAAGGTGAGGGAACTT 




ATGAAGCCT CCGGGTAGCAAGCTTCAGAGGGAATAGATTGTCAAAGTTTC 
CTATCAGACATAAGGTCTGTGTTGATGTTAATGCTGGTCAGCTTTTCCTG 
AATTCCAAAAGGGAGAAGGGTATACTGGGGCATGTCCAACCTTCCCTTCC 
ATCATGACCTGAACTAGTTTTTTCAGGTTAACTTTGGAATGCTCTTGGCC 
AAGAAGAGGGGTCCATTCAGATGGTTGGGGGGGCTTAGAATTTTATTTTT 
GGTTTACAGTGAAGACTTTTCAAGCTAGACACTTAAATGAGTATGTTGCA 

AAATGGCAATTTCTTAGCACGGC 

GACGTCCTAAAGAAATGCTAAGGTAACTCAATTAACTATGCTAGAAAAGA 
GAGTTAAGTATTTAGGAGGATTTAATATGGTGTTAAAGTTGTGAAAATCA 
AAATGGAGACACTAATGTTAAGAAAACCCTGATAAATGGAGCCAGGGAAG 
. . ... . . ~-»m^m/*% (~T"rr_T» rrrrTrt & TP a Tn& &A AAGA.CT 




~GCAAAAAAUftAAAV-UI iva»-rt.i-n*uujvj>.w*i. i«w»r».<-* »»— • 

ATACTACTTTAAAAGGACATGTGCCCAGCAACTGCCTGTCCAACCTCAGA 

^TGGCAATATCTTTGTTATTGATCTTAGTAGCCCAGCATAACTATTTCAA 

AACAGTGATGTAATGCTCA rrn ' T ' l ' I ' lL l 1 1 1 GAAAACTTTTGTCTTCCT 
GTAAAAACCrrTGTCTrCTTTACTTACCCTGAATATGCACAGAGTTTACT 
ATGGAGTGCATATTCCTGTTGCAATGCTCTATTCCCAAACAAACATCATT 
TTCT^TTAGAGAGCCTCTCTCTGTTTGTGATTTAGGTTGGTGATGTAAAG 
-AATGGCATAACTGAACACTGATTCAAAGAAAAGTGGCTTTTCTCTTTGT 
TGTATTAAAAAGAGGCCTTATAAATAGGATAG7AAGATTTGTAAGTTGAA 
C7TAAAGCATGAAGAAAATTTAGGGGCCAGGCAGGG7GGC7CACACCTuT

AA7CCCAGCAC777GGGAGGCCAAGACAGGAGGA77GC77GAGCCCAGGA 

GTTCAAGACCAG7CTGGTCAACACAGACCTCATCTTTACTAAAAATAAAA 

AAATTAGGCCAGGTGCAGTGGCTCATGCCTGTAATCCCAGCACTTTGGGA 

GGCCAAGGCGGGAGGATCACTTGAGGTCAGGAGTTCGTGACCAGCCTGGT 

CAACACGA7GAAAC CCCA7CTC7AC7AAAAATACAAAAAAA77AG CTGGG 

TGTGGTGGCGGGCACCTGCAATCCCAGCTACTCGGGAGGCTTCAGGCAGG 

GGAATCACTTGAACCTGGGAGGCGGACATTGCAGTGAGCTGAGATAGTCC 

CACTGCAC7CCAGCC7GGGCGAC7CAGCAAGAC7C7GCC7CAAAAAAAAA 

AAAAAAAATTAGTCAGGTGTGGTAGCACACAGCTGTGGTCCCAGCTACTC 

GGGAGGCTGAGGTGGGAGGATCATCTGAGCCCAGGAGGTCAAGGCTGCGG 

TAAGAGC7GAGATTGTACTACTGCATTCCAGCAGGGGCTACAAAGTGAGA 

CCCTGTCTCAAAAAAAGAAAAAGAAAAAGAAAATTATGTTTTTAAATTTA 

TAATTATAATAAATTTAATTACATAAATTTAAGCTCAAGTAATTGTAAAT 

ATTCTTTCTGTGCACATAAGTTATTCTTGTATTGACCCCACAGGAGCTGG 

CCATTCTTCAAGTCAGAAGGCCTGAGAGAGGAGCTGCCCAGSTGGTCTTC 

ATGGGGCTGTGCGGCCAGTCATCCCCCACAGGTTGACAATCCTTGTGTAC 

TTCATCCTCGTTGGATCCTCTGTATCCCTGACGATGAGCAACTGTGAGGC 

CCGTTTCAGCACTGAGTTCCAGTCAGGAAAACATCCACCCACCCACCACA 

CGCTCACACTTACACACACATTCACACATGCACACACGTTCTGGCTCCGA 

AAAAGAAAAAAAAAAAGCAATTTAAAATAATTCTGATCCTTTGCTTATTT 

CCACAAACTCCATGAAAATTGTACATTGTCCAAGCAACATTTCTTAATAT 

TCTCTTTTTCTCTCATATCCATTTTCCTTACTGC7GTCTCCACCTTTCTC 

TTCCAAACTCCCTGTTAAAATCCCTGCCCCAGCGAACTTTTATTCAATT^ 

TGTGGAATGGAGGCTGCTCTGATTTAAATTAAAAAAAAAAAAAAAATCCC 

TACTCCATGTCCCAGATCCCTAGTTGTTTTTTGTTTTTTGTTTTCCTGAG 

ACAGGGTCTTGTGTCTTCCATGCTGGAGTGCAGTGGCATGATCATGGCTC 

ACTGCAGCCTCAACCTCCTGGGCTCAAGTAAATCTCTTGCGTCAGCCCTC 

C CCAGTA GCTGGGAGTTCAGGTATGTGCTACCATGCCTAGCTAATTTTTT 

TCTTTTATTTTGTAGAGACACGGTCTTGCCAGGTTGCCCAGGCTGGTATA 

GAACCCCTGGGCTTAAGTGATCCTCCTGCCTCGGCTTCCCAAAGTGCTGG 

GATTACAAGTGTGAGGCACTGCACCCAGGCTGGATCCCTGCATT7TTACA 

GATTTAGCATCACAAAAGTCTAAACAATTAGACTGACTAAGGCAGAACTG 

CCCTTATGACAGCAGACATAAGAAGGAAAAGGCCAAAACACTGTGTTAAA 

AATT AT C C AAATGTGAGGAAAAGGCAAAGAG AGTAGGTGTGCCTTTTTAG 

TGTCTAAGCTGCCTGCCCAAGGGGCATCTGATGCTCTCAGGCAGGAGTCC 

ACAAA7777777T7G7AAAAGATCAGATAG7AAA7C7777CAGCG7GAAG 

AGCA7GAGG7C7C7G7CACAAATAC7CAACCACCA7TACAACA7GAAAGC 

AGCCAACAGACAACACATGACAAATGAGTGTGGCTGTGTTCCAGTAAATC 

7TGAT7ACAAAAACAGGCAAGAGGCCAGAGCTGACCCATGGGCCATAGTT 

7GC7GACCCC7TCTGTAAAGGAAAG7AITr77G777GAC7TGC7GTTTAC 

CA77GAT7GAACACAAGGCTCTGTAGAGT7ACT7G7TAACTTGCAGAAGA 

TTGATGAG7GGCAAGTAAT7777A77CACCAGAATA7ANNA77A77C7G7 

TCAG7AGATAAGA7AAACCCAC7G77ATAT7ACTG7CT7G777AGAA7G7 

GACT77GATTC ATTTT TTCACAAATTCATATTAT7GCCCTAArrrGTATA 

7AAG7A7GC7TCTTTTAAAAATATATATTT7T7AATAAATTTGAGACAGG 

G7CTCAC7AGG77GCCCAGCCT777GC7A7AATGAGAGCA7AAAG7GAA7 

T7CACAC77TAGCCTAGTGCATAGA7GGGA7TACAGGCACAAACCACTGC 

A7GCAGC7AAC7T7GCTTCTCATTCCAGCACG77CTATTCCN1TOGNTTTT 

CATA7ACGCGTCTCTTAA7GC 

>Contig40 

CGCA77CAGCCCAAGTTT7CTTGVG7G77AAGGTnTTG77AC7C7G7GC 
CCAAA7G7CC77CCAAAAAGGTTAAGTT777T7ACCT7CCTGCCAACATT 
ATA7GAAAG7G7CCACTTTTGTAGACT777ACCAA7GC7GAC7ACTTTTG 
G7TTCAAAAAAGCTCTC AGTAA T7T7C7A77AA7TAC7T7TACCCTTTTT 
TA7TGAGGG7G77CAACTTTT7AT7GTTAGCA7ATTCTCTCTGGGCTCCA 
77GGACGCC77G GCA GCTTTTTGGTAG7AGG7GCCTTTAGAAAAGTCCTT 
C7CG7C7GGCCCTTTCTGAGCAAATC7AG7GAACAGAA7TGGCTCCA7GC 
7CAGCA77GC77AATACGGTTGATCCAGGGCC7AGGAC7CA77CC77CA7 
TACCA7CCAC77GCATTGTCTTAAAGCAAGGC7C7A77AAT77AAT77GG 



FIG. 4 (17 of 61) 




WO 99/06426 PCT/US98/16102 

CA^^CC'GTCCCAGCTCl-rTAGTTTCATTAAACAAA^TTTAGKAAAC. 

'CCCAGTAGATGCCTATGTTGCTTCCTTTTAAAAAATTTTGGAGCTGTTT 

-"^^AGCCrAACCrTTTCTTCAGGGCAGGAGTTAAGTCCCTTCTACTGCA 

f-CC-'GTGAAGATGGTGATTCAAGAGGCAGGGCACCTGTTGCTTTGTGAA 

ACAGTCCACTCTGCAGCTGGGCAGCTC7GTTACTAGAATGTTCTCCCTTC 

TGGGGAGCCAATATTTTGATGTCCTCTGTGAATCTCATCTGCTTATCCCA 

T-rTGTTTATGTCCTTGAAGATGCACAGGTCTGACACCACGAGGTAGCCCT 

TAGAAATTTGATGGCATTTCTGATGTGTCCCCAACTCTTCTCCAACCACT 

CTCCCAGAGCTTGTTTCTTAAGCCCCTTGTGGAGCTGATTGCTTTCCTC 

AAGGCAGCTCAGTTTTTCCCAGTTTGCTCCTGGTGGTCCTGAAATATGAT 

TGACTCCTGAATACTCCAGGTGTGAAGGAGAGTGGGGGTGGCCTTTCTAC 

T T GTCATGGCCTGGGTTTTAAGTTGCTGTCCAGTGGAGCAGAGGTGACTT 

TCCCAGTGAACTACATTTTTTCCCCTCTAAATCCTTAGCAATTTTGTCTC 

CAGAGGCAAGACCTGGCCAAACCATTTGTGTTGAGGATTGAATCAAGAAT 

— - mr** /^n^"P»^«r^^r'<'r'i~rr'aTrTr;afIfZaCMGCGTGTTCCA 




AG C C C C I v-ALl I IjrtHlU^U l^JM/V>\V- 'J-'JVM-ti.rtvJ - - — -■ -- - - 

CTATGATTTTCCTATAAATTAATACATGCCTGTGACAATGTTTAATTTAT 
AAATTAGGCAAAGAGGCCAGGCGCAGTGGCTCAAGCCTGTAATCCCAGCA 
^rwaraTrarvSAfiTTCGAGACCAGCCTG 




ACCAACATGGAGAAAV.i.UV.ijV-u i ^ iav- A<vw ^* rt, -**™z-i:"ii:'Ziri 
ATGGTGGCAGGCGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAG 

AATCACTTGAACCCGGGAGGCGGGATTTGCGGTGAGCTGAGATCGTCTCA 

~TGCACTACAGCCTGGGCAACAAGAGTGAAACTCCGACTCAAAAAAAAAA 

AAAAAAATTAGGCAAAGAAAGAAATTAACAACAATAAGTAATGAAATAGA 

ACAATTCTAACAATATACTATAATAAAAGTTGTATGAATGTGGTCTCTTT 

r TCAAAATTACC' l"ri"l ' i IT 1 1 T G AGACAGGGTCTCACTTTATTGCCCAGG 
CTGGAGTGCAGTGGCACGATCACAGCTTACTGCTGCCTCGACCTCCTGGG 
ACCAAGTGATCCTCCCACTTTAGCCTCCTGAGTAGCTGGGACCACAGGCA 
TGCACCACTGTATCTGGATAATTTTGTTTATTTll'rTTTGCAGAGAGAGG 
AGGTCTCACTATGTTrCCCAGGCTGGTTTTGAATGCCTGGGCCCAAGGGA 
TCCTCCTGCCTTGGCCTCCCAAAGTATTGGGATTACAAGCGTGAGCCACC 
ATGCCTGCCCCAAAATTATCTTATTGTTCTATACCCACTCTTCTTCTTGT 
"r w/*»n-ri-!r>r^rrTT«a.TCiGaTGAAGTGAGGTGAC 




TGATGTGGGCATAGTGATL>uAV»i.U i 1 i«jVA.i\jntrti 
TGTCAGAAGGAGGGTCATCTGCTTCGGTGATCCTGGATCATAGAGTCATG 
ATGATGTCAATGGTTGGATGTCAGGAGCAGACGATGTCAATGACTAACGA 
TAAGCTGGACAGGTGGGATGGTGGCACAAGATTTTATCACGCTACTCAGA 
A^GAGCACAATTTAAAACTTCTGAATTGTTTATTTTTGGAAT^CAT 
AATATT T TTGGATTGCAGTTGACTGTGGGTAACTGAAACTGTGGAATGT 
GAGACTGTGGAAAAGTGAGGGAGTACTGTATTATGGAACTGTAACTCTAT 
TCGG7AGGGGAACAGAATTCACATTTGTGGGGCCCAGGTCTCTGCATCTG 
TAGGGATCQ^ATTGTTTCATTTCTCGTTGTAGCAAAAACTTGGCTTCGGA 
ATCAGACAGATTGATGTTTGCTATCATTCTAAATGGGTGCMCTACACTT 
"CCTCAAGAGGTAGTTCTGAAAATTrAACAAAATGTGAATTTCTTGGTAA 
AAAAAAAAAACCTCAAAAATATTCAGTITCCTrTCCTTTGTCTCTGATGT 
^CTCCATCAAATACTGGGAAATATGTGTCTCTCATAGAAATGT^TGGAT 
CTTTGTAATTCTGATTATCCACAAAC CTTGGGGATTAGCTGTTTCAATGT 
TCCTATTTTACAGATAAGAAAATGGAGCCTGTGGTAAGTTAAGTGAGTTA 
CTCATGGCTACTTAACrAATATTTrACTAGGTGATAGGCCAGAGCTAGAG 




CTGTCTGTATGTU T Ai\j 1 vjUU ivj i a u«u»u * ««www»»"- - - - - - -- ~ 

TAGAACTACCGGTTTGTAATGAATTCCACTTGTAAATGACTGACCACTCA 

AG^aAACAAGTGTTTTTTCTATGCTTGACACCTGTTTTGGATGCCAA^^G 

GATACAAATGTAACTTCAGACACTCTGGGCCTCATTTTGCACTCATTAGC 

ATGTCCAA^TTAAAAAGACTGACCACACCAAATATTGGTGAGGATGTGG 

^.i^i Wl -^««ft«iiTr.TiaAATGGTACAAT 
GA7GAGCAAGC7G7GG7. *J7 CTATGCA*. * GGTATC CTACTCAGCCAGi

AAAGATATGGCTAAT 

>Concig41 

GACAACAA7G7CA7GCA7AAGA7GACGA7GGCC7GGG7GA7TGA7GCAAA 

CAAGGATAAAGAAAATAATCAATTTTG7CCCCATTTTCAAAGACAGATAG 

CAGCAGCAAGAGTGTAAGTCTGAGGAAAGTCATATTCCTTCCTCCTACAA 

CA^GCACACACACTTACAAAAACAATACACAGACTCCTGGCCAATGGAC 

77CAAAAC7GAGGAGGA7CA77AAA7T7AAA7G77CACCGC7GCA7GAAA 

7CTCCCTGGGTCCTGCCCTCCCTTCCCCACCCTCCTCCACTTGGGCCGGG 

GCACAGCAG7GA77C7C7CACCTC7CAGAG7GAGCCAGTG7TGGC7GCA7 

TGAAGGC7CCAGA7A7GCAAACAGGGCAGA7AT7CC7GGACCAGGG7GCA 

CAGAG7GAGGC7CCAACGCACCC7A77AAC7GCA7GAAGGATGAA7GAGC 

CTCTGGTATGGGCTGGGACAGAAAAAAGGATTCAAGGGGCCCAAAAGGGT 

TTGGGTGGAACCTACCAGGAGCGGCAGTACAGACTCCTTGGGAAGGTGGC 

CATGA7TTAGCCACATTCACCAATAGGATAATCTGGAGAATTTCCTAGCT 

TGAGTTTCTGGGAGAAAGCAGATTTCTGGATTATCTGGTGACAGGTAACA 

GGGCCGAGTTCATCCACAGCCACCTGCAGTGTTAGCACCTTAAGCTGAGT 

TCCTTGCACCAGGATGCTGTCACGCCCAGTCAGTGTGAGACGGTTCTTGG 

CTGAAGGAC7GAAAAGCTTGGGTAAGTGACTTCACCTAAGCCTCTATCTC 

TTGCTCCCGTAAGTCAGGGCTCATTGTGGCTCCTTGCAGGCTTGACTTCA 

GGG77AACAGAGAAAATGAAGG7ACAAG7GCC7TG7GAACTCTGAAAC7C 

CAAACCAGTCATTCTCAAAGTGCCGTCCACCAGTCTAGCACATCAGCATC 

ACTGGAAGCTTGTTTGAAATGTAAATTATCAGGTCCTCCAGAGCTATGTA

TGAATTAGAAACTCTGGGAATGGGGCCCTGCAATCTATTTCAACAGGTCC

TCCAGGTGATTCTGATGCAAGTTAAAGCCTGAGAAACTCTGTCCTATACA

AATGGATGTCAACTCAAGCTGCTCTTCAGAATCACCTATAGCACTTGTTC

ACCCGAATCCCTGAGAATGGAGCTTCAGGACTGCTATTTCTCAAAGTTTG

CCTGGTGATCCTGAGATGGGGTTTGGGGGACAGAGATCCAAGGTGCTACC

AGGTGTGAGGAATTGTTAGAAGGCAAACCTGGCTGTCATCTAGGGTGCTT

AAAGGGTACAGATCCTAGGATTCTGCCTCTTACAGCTGAATCAGACTTTC

CTAGAATGGGATTGCTGTCCAATGGCATGCCTCCTGGGTGACTCTGATGT

ATAGCCTGGGCTGGGAACCACCAGAGGATTATCTTCCATTGACCAAGCTG

ACAAACTCGCTTAAGGCTCTGAGTTTCACACTTGATTTTCTAGCCCCTGT

CCTTCCATGGATCACCTGCCCCCTTCCCTCCTAATCAGGAGCACAGTCAG

TGGATGCACTAATGTGGCCTCTCCTTGGCTGCAGGGAACAGGTGGAAATG

TGGCCATAGGTGTGCAGGGCTGCCTGCCATGTATTAATAGCTACAGATTT

GAAAGATCCAAGGACAAGAGACTAGAAAAAAATTTAAAACAGCCAAGCAT

TGGCCCAGTAATGGCATTTCAGAAATCCACCAAAATATTAAGATGCTTTT

TGAAAAATATCCAGAGCACTCATGTAAAAGTGCTTAATTATTAATAAAAG

CTGACATGTGTTGGGTACTTCCTGTGGGTCTGGCACTAGGCTAATTATGT

TTTTAGGAGTTGACTCAAATGCTCCCTGTCATAATTATGTGAAAAAATAT

AATTATTAGCTCCATGGTACAAATTAAGGAGAGGTTACATAAATAAAAAG

GAATGATACTCAAATTAGTAACCAGAGCCCATGCTCTTAAACACTATGCT

ATTATTTGTGGACTCTTACATAGGTGGCAAAAGTCAAAGGCTAGATTGAC

TTCTGTCCACTTCCAGCCAAGATGAAGTACAAGATTCAGATACACCCTTC

CGCATTAAACAACTTAGGAATCAGACAAAATATACAAAGCATTGTTTGTT

ACACATTGGATAACAGACAGCACTAGATAGTCGTGTCTGAGAAAAGCGGT

GAAATGAGCTGAGTCTTAGAATTGCCCCAGTTTACTAAGGGGCATAGTAA

GGGCATAGCTGCAGCACAAAGAAGCAGAACCCAACAGAGACTGGCGTTCA

CCTGAGTTGAGAAAACCAAGTTGAAAATTTAGGAACACTAACACAGATAT

GTAGGCAAGAGTATCAGAGAGGAGACAGTTGTAGGGAAAAAGAGAGCTTT

ACAGAGAGACAGCGAGAGCTCCAGAGACCCGCAGAAGATTGCCCTGACGT

CACTAGCTGAGTACCGATCAGTGCATACATGTAAGGATATTACTCAATAT

GTGGAAAAGAACAGAAGGAATGATGTCCAAAGCTCACCCAAAGACAGGAA

TCATTTATGTTTCCACCAGCCAGAGTGGAACAACCTTGTAACGCATATGG

AGTACTCAAACGAATATTTCCTCAATAATAAGTTCAAATTAACTGAGACT

AAAGCCTGCCCGCTTTGTCTGGACATGCCTAACAAAGCTTTGAGGGAAGC

GTCAAAAGAATGAAACCGTGTCCAAGTAATTTAACTGTGTCCCAGAAAAA

AATTCAAGAACATTTAAATAAATATTAAAATATGATCAAACCCAGCAAGG

r7AAA77CAAAA7G7C7GGCA7CCA77AAAAAATTACCAGCCT7GAAAA7 
-SGCGGGAAAATATTA'. JATAATGAA.. .J^AAAAGCAATCAACAGA/

AGGCCTAGAAAGTATACATATGATAAAATTAGCAGACATTAAATGGTTAT 

GATTAATTTATTTTATATGTTAAAGAAGGTAGAGAAGAGCATAAGCACAT 




CACAGTGTCTTGCCTTAGTAGAGGGAXAAATGCTGTTCTGTTCCCGCCTA 
ACCCATCTTGAAAGAAAATCTGAAAGGATCAAACTGTATTCAAGTAACCT
AATCACATCCCAGCACACAGCTCGACTAGTTATAAAAACACAAAATATTA
ATATCTAGAAACACAAAAATAATATCTAGCACCCAACAAGGTAAAATTCA 
CAATGTCTAGCATTCAATTGAAATTTTCTAGGCCATCAAAGAAGCAGTAA 
AATATGACCTATAAGGCCGGGCACATTGGCTCATGCCTGTAATCCCAGCA 
C-rCTGGGAGGCCAAGGTGGGTGGCTCACCCGGAGGTCAGGAGTTCAAGAC 
-AGCCTGGTCAACATGGTGAGACCTCATCTCTACTAAAAATATAAAAATT 




TGGCACCAATGCACTGCAGCCTCATTAGAGAACATCGGGAAG 
>Contig42




GCATAGCCTGAAAGCCAACAGTATCACTCCTCCTCTAGGTGTGGCAGAGA 
TGTGAGAGAAGGAGACTGACAGTCTGTGGGTGTGTATGCAGTGTTGGGGG 
AAGCGAGGCACAGGGGACAATACTGTGGTGTACSAAAACTAGTCTAAGGTA 
GCATCAGGAAATTCATGAAACCAAAATGAATTTCATAACAGCACAAGACA 

tt&T — GTTTTTGC CTCCCTCTCATTl"! 1 i i 1 1 1 1 TTTTGAAACAGAGTC 
TTGC-CTGTCATCCATGCTCGTGTGCAGTGGTGCAATCTCGGCTCACTGC 




AGCTGATTACAGGTCTecAtUA(-i-s-u>j».uu«s. * * * 
TAGAGATGGGGTTTTGTAATGTTGGCCAGGCTGCCCTGTCATTTTTTTTT 

TACTAGTGTCCAGTGGAGTTTTTTAGGGGCTACATAACATGATACTGTCA 

TTAATCTAATGGCTAATGAAAGGGATATGTATATGTTTTTGTGTTTAAAA 

VV ... « w<r«rp& Anarrr&TAAAGGGGTCCTG 




GATTGTACATTAGGGTCATCTGGGAAGCTrrAAAATAGTACTGATGCCCA 
ACCTTACCGCAAACCAATTAAGCCAGAATCTCTGTGGATGAGAAGTCTTC 
ATTGTCATCATCACCATGACCATCATCATTGTCACCGTCACTACACCATT 
ATCATCATCATCATATCATCTTCATTATCATTGTTAGTATCTCCATCACC 




GCrTTCACCCCTCCAACACCACAGTAAA-m'TlTUi 1 1 1 1 J 
TATACTTTAAGTTATAGGGTATATGTGCATAATGTGCAGGTTTGTTACAT 
ATGTATACATGTGCCATGTTGGTGTGCIGCACTCATTAACTCGTCATTTA 

^ctaggStatcttctaatgctatccctcgccgctctccccaccccatg 

ACAGGCCCTGGTGTGTGATGTTCCCCACCCTGTGTCCAAGTG^CTCATT 

GTCCMTTCCCACCTATGAGTGAGAACATGTGG^^ 
TTGTGATAGTrTGCTCAGAATGATGGTTTCCAGCTTCATC^CGTCC^TA 

^Satatgaactcatccttttttatggctgcatagtattccatgotg 

TATCTGTGC^CATTTTCTTAATCCAGTCTATCATTGCTGGA^TWGGG 

SggScc^gtctttgctattgtgaatagtgccacagtgaacattcatc 

^CAXGTGTCTTCATAGCAGCATGATTTATAATCCTTTGGGTATA^CCC 




GAATTGCCACACTGTCTACCACAATeeTTiiAfl.1 iaoi i 
AACAGTGTAAAAGCATTCCTATTTCTCCACATCCTCTCCAGCACCTCTTG 

^TTCGTGACTTTTTAGTGATTGCCATTCTAACTGGCACCACAGTAAATTT 

^Stagatt^ataagcaaattgtatttactgtgcaagaattggtctatt 

•^^^AACCATGTGTTGCAAACATACAATGGTTAATTGTGATATTTGCTC 

agta^g^catcagatcactacacagacttgaggtaattccacctaaa 
AGCAAAGAGAACTGACCCCACATTAACTGAGAAGTCTTTACTTATTTAi i'

C C CT AT AAAC GAGC CAATATGAAGAGAAGGC CTTAATGTGGTTAACTATG 

TAATTTTTTTCTGACTTTTTGAAATACTGAGAAGAGCTCATGACTCTCCC 

ATCTCCTAATTCTACCTTGGTGGATTTTAGACTGACCACAACTCATGGGT 

AAATGAGGGAAGACGAATAAGAAACCTTGCTTTTTTTTCCTCCTTGTTTT 

TGGCrGGCTGCAGTGGCTCACACCTGTAATCTCATCACTTTGGGAGGCGV 

AG€TGGGAAGATCACTTGAGCTCAGGATTTCAAAACTGGCCTGGGCAACA 

TAGTGAGACCCCATCTCTAAAAAAAAAAAAAAAAAAAAAAAAGGCGACGG

GCGGTGCGTGCCTGTAATCCTACCTACTCAAAAAGCCGAGGTGGAAAGAT 

CACTTGAGCATGGGAGGTCAAAGCTGCAGTGAACCTTGATTGCACCACTT 

CATTCCAGCCTGGGTGACAAAGCAGGACGCTGCCTCAAAAAAACAAAAAC 

AAAACCTTAATTTTTTGGCTATTCTTTTCTGGTAAGAATGGTATAGAGAT 

GGGGATGAGGATGGCTATTGTATGAGAGAGCAAACAGGGTCCAAGCAGTG 

CTCTGGGCTGTCTAAGGACCAGTAGTCAGCTTAACTTCTCAAATTTCCAG 

GGAAGGAGTTCGGAGTGGTAGAATATCCTGGGTATGCCCAAAGCATCACC 

TTGCAAATAGCCTGTCATGAATAATTTGTTTCATTTGTTATGACTGGAAA 

CTGGCTTTGTGTATGCCAGAGAATGGGGGCAGGAAAGAGAGATTGGTGTC 

TTGAGCTCTCTGTGCCTCTGGGGCAGTGATGCTTTTCCTCTCATGTGGAA 

GGAGAGCATGACTGAAAAGGTGCACAAATAAGGTGTCTGTGAGAGAAATT 

AACCTTCCAGATACAGAGACACAACCTTCCCCAAGAGGTCCTCATTGCTC 

TGCCTTTTTTCCTTTTTm'GCTTGTTCTACCATTAATAACAGAAACTGA 

TTATGACCTCAAAAGAGAGGAGAAAGCGACTCTCCCCACCCTAGAGCTAG 

TTAACCACCATATCTTCCTAGATCTCAGTTCAAGAGTCACTTCCATCCCC 

AATAAAAGCCCTTGAGTGCTGAGCACCTCTCCGTCATAGCATTTGTCCTA 

GGGGTTTTTGTACATTTTCTTGTGTGAAACTTGGGTTGACATCTGTATTT 

CCGACTAGATTACAGTTTCCTCAAGGGTAGGGATGTCTTGCTTGCCATTT 

TCAGTTCCAGCATCTAGACAGTACCTCAAGCAAACAAGGCCGAGGGGGGT 

GCGGATCACGAGGTCAGGAGTTCGAGACCAGCCTGATGAACATGGTGAAA 

CCCCGTCTCTACTAAAAATATAAAAATTAGCCAGGCGTGGTGGCAGGTGC 

CTGTAATTCCAGCTACTCAGGAGTCTGAGGTAGGAGAATCGCTTGAACCC 

GGGAGGTGGAGGTTGCAGTGACCTGAGATCCACTGCACTCCAGCTTGGGT 

GACAGAGCAAGACTTCGTCTCAAAAAAAAAAAAAAAAAAAGAAAGAGAAA 

AGAACATCAAATGAATGAATGAGTGAGATG AATGA GTTAGCAGTGTTGGA 

TTTAAGTGTCAGATTCTTCCCAGCTTGACTTTTTTCTTTGGCTTAGTGAT 

TTTGAGGTCNCAAGATTTATTTTCCTTTCACAAAGGTGATCACTACCATA 

AGATCTTCAGAAAAAGAATGTGGCAAGCCANGTCTCACTAATGCAAATCT 

CTATAACAACTGTATCAGTACT 

>Contig43

G AGGTGT CAT AAATATGGAC CGATAGATG AATACAGGTAGGATGGGACAC 
AATCTAAGATCCCAGGGGGGGGAGACCACACGCTTGGTTAGGGAGACCCA 
AAGTGGACCGTGTGGCCAGAAGAGTCCCGCACTGCACTCTAGTGACAGTG 
CAGAAAGTCACTGTGGGAAATCTAGAAGTTTCTACAGGTTGCTATTTCAT 
CATAGCACTGTGCAGGCCAACCCTTCCTGCTCCACTGGCTGTTGGGAAAA 
GCTTTCTCTTTTCTTCCTAGCCAGGGAGCTCTCAAAGTGTTCCACTCTCT 
CACCTCCACCCAGGCGTCCAGGTGTGGAGGACACTTGCCGGCTGCTTGTC 
TGCTGACTCATCCCTTGGTTTCA CTTGG AAAACCTACCACCAGCTGGCCT 
CTTTCCAAGCATCAGCCTCCTCATTTTCTTAATCCCTTAGGTGTGATCTC 
ACCTCC ACACA GTAGATTGCCTCAAGG CCCAA TTCCAATAT GAATA AAAA 
TGATTATTTTGTCATCTTCCAATCTTCCTTTTAAAATATTATTTTATAAT 
T C C CTTT AGGAGGATCAC CTAAGTGAAGACTATTTTTACCTAAGAAATGT 
TAAAATGTAAAGACATGGTTGTAATCTGGGGATTCCTGTTAAAATGGCTA 
GCAGACAGAAGTCAGACGACAGG CTAGA AATGT GTGA AGAGTG GTTG CCT 
TTGAAAGGCGGAGTTGGTAATGATTTTCTTCCATTTTTCCATGCTTTCCA 
ATTCTCTACAAAGGCCTTAATATTACTTCGATAACCAGGACCTCTGATAA 
CCTGCCCCCACCGAGTAAAGACTTAGCTGGGAAAGTCAGCTTCATGTGAG 
GTAAAAGGAACCAGGTAATACACAATTCCCACTGCCAACTGTCGGGTGTG 
CAGGC CTGAGC7TCC7GCATGTGGGAGGAAAGAGAAAGAAGAGAGAAACT 
CCAAGATCCAAGAGATCCAGCAAGAAGGCTGGAGTCTGAGGACGCAGAAA 
GCTGAATGGCACAGTTACCACTATTGTGCTGAGGTTCTGTGGCCTCTGGG 
TCTCTTGACAACTGGGCAAAGACCCACAGAAAACTATCTCTAGACCCTAC 
.-^GTGGGAGGGGAAAGl - JTTCAGATCA * CTACAGGACAGCCACCTGGA>.
^TCAAATGGCTTACAGTTCCTTCATCCAGAGGGTCTTCATCTAGTACATA 
~ r- * r-r.Tr. rr a A or CTGGGTC CTGG AG ACaTGACGGGGAAC C CATTT AC CA 




-GiiU- i ivjl iav.iwxw*v-#»± 

GGATCGAGGAGAGCTTGTTAGGCAGAGAAAATACCCAAGGGCAAGGGAGA 
AGCCAGCCTGTTCTGAGCACACACAGTGGTTCCATCTAACTGGGCCTCAG 
^GCCAGGTTGGACTGGAGATGGGGCTGAGGAGCTGTCACAGAGCATTCTG 
GACACAGATGTCACATAGTCCCTTGAGGTTAGGGTCCTTAGGCATGGCAG 
CATTGCTTTGAGTTrTTCCTTTTGTAATGTTGCCATTCATGACAATGTGG 




TAAGATGTGAliU wAAVjVjAAAA 1 UiUjVjHiV-Av-'- i. u«ftutv. i * - 

CAGGGCCCAGAGAGAAGCAGATGGCTTCCTGAGGTTTTAAGTAGGTAGAA 
TCAAGGCAGCTGGTAAAGATCTTTTATTACATATAAACTGGAATAAGCCA 
TCTGCTCCAAGACAAAAGAGTAGGCGGAAAACAATACAAGACAGAAATGG 
. , mi-isikarrT'rt^iiryi&aTnTrKaaTT&CAGTAGAGAGTCCAACA 



C~GGCTGCAATCATAAAAATGTAAAAO\AAU*aaaai i iviv.*»vjv»j.«i«v. 
TTACTTAGAAATAATTAGCTGTCATATTAAGTTCACTTGTGTTATGGCTT 
AAATGTGTCCCCCAAAATGTGATGTGTTGGAAACTTGATCCCCAATGCAA 
CAGAGTTGAGAGATGGGACCTTTAAAAGGTGATTAGGTCATAAGGGTTCT 
GCCCTCATAAATGAATTAATACTGTTATCATGAGAGTAGATTCCTGATAA 
AAGGATGATCTCTGCCTCCTCCCCACAGCCCTCTTGTGCATGCTTTCCTG 
CC' T TTCCACCTTCTGCTATGGGATGACACAGCAAGAAGGCCCTCACCAGA 
^GCAGCTCCTTGATCTTGGACTTTCCAGCCTCCAGAACTGTAAGCCAAAC 
AAATTrCTGTTTATTATAAATTACCCAGTCTCAGGTATTCTGTTCTAGAA 
GCACAAAATGGACTAAGATCATTAGATTATCATTTTTTATCAGACTGTTG 
- ... . . . * -> » &»r»r» &<":arcA.<VTGf!ATGCAGCA 



GCACAAAATGGACTAAGATCATTAGATTATUAi iui iftiui^i^i ^ 
AAGTGAAAAATAAAAATCAAATAAAGAAATTAAGAGAGCTGCATGCAGCA 
GCTCATGCCTATAATCCCAGCACTTTGGGAGGCCAAGGCAGGTGGATTGC 
CTGAGCTCAGGAGTTTCAGACCAGCCTGGGCAACACGGTGAAACCCTGTT 
TCTACTAAAATACAAAAAACTAGGCCGGGCGCGGTGGCTCACGTCTGTAA 
TCCCAGCACTTTGGGAGGCCGAGGCGGGTGGATCATGAGGTCAGGAGATC 
GAGACCATCCTGGCTAACAAGGTGAAACCCCGTCTCTACTAAAAATACAA 
AAAAAATTAGCCGGGCGCGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGG 
GAGGCTGAGGCAGGAGAATGGCGTGAAACCCGGGAAGCGGAGCTTGCAGT 
GAGCCGAGATTGCGCCACTGCAGTCCGG^GTCCCGCCTGGGCGACAGAGC 
ffunnnrrnvr & & A&&AJL&AAAA&CTAGCCAGGCATGGTGGTGT 




ATCCAGCCTGGGTGACAAAGCAAGACTACAXTTt-AA^ 
AAAAAGAAAAAAAGAAAAGAAAAAGAAATTAAGAGAAGGGCAGGTATTAA 

CCCCAAATATCCCAC.CATAGGGACACATTAAAGTTTGCTTGGCCACTCCC 



CCCCAAATATCCCAC.CATAGGGACACAi-i AftAia i t^~t^^lrZln 
CTAGCATAATATATGGAATGTCTTCAAGGACCCTCTGTTGTAAATACAM 
GCCCTGCTGGACTTAATACAACCTGCAGGCTTTGAGATCCCTACTCTGTT 

GCCATCTCTCATAGGATTTGCAGACCAAATCCAAATA^^ 
CACTCACAAACATGCAAATCAGAGCAGAAAAGAAACTTCTAAAAG^CCT 

GAAACTACACTTTATGAGAGAAGACAATAGGGACCrGAGGGTGCTAGA^ 
TTTCTCTCTATGCATCTATGTTTCCAGGGCTCACTTTCTCWTAAACTCT 
TAAATTGCTTTTAAAGTAAGGGAACAAGCAAACATTACATTTAAGAGAAA 
TCAATTTCATAAAGAAGGGGGGATGTCCAGGGTACTTrGCTTCCATGTTT 
TGCTTCCATGAATTTGTGTTTAACAGAAGATGCAGAAAAACACACAATTA 

TTGCAAAATCAACMAAaTCCACTCTAAA^ 

GTGTCACAACTGAAAACACATATTGTGGCTAATTATGTGTCACAAATTAG 
AATGACAAGGCAAGAAAAAAAAAACTCTCTGATTAACTAATAGCAGCCAA 
CACAGACAGCCTGTGTAGCTCGACTCTGCTGGTTTATAAAAGGCAGAAGA 
* ^ * & or^rTTrTGTSA.CCGCAACAGGAAGGGCCTCTGCTCTTAATAAA 
" uv - u - - — -^'^m i ijAAUACACTCTACGCTGAAACAGTAACT

TTCAATAAACCATCTCTTCTCACCGCACTCTGCGACTTGCCTTGAAT^C'- 
TTTGTGTGCAAGATCCAATAAfirrTrTrT^^r-^^^i^iAi^iir: 




ATAAAGTATATGAATCTGAATACTGTTGGATACAAAGGGAGACTATNNAA 
TGTAATACGTCGCCCGAAATGACTACACTGTTGGTGATCTTTCTTTCAAG 

aagcanaatattgcctcnaacatcctgtacatggtataaaatttS 

>Contig44

CCCAGCAAGAACACCAATACAACCX3GGGGGGCGTTCTTTGTGAGGGGTGG 
GGAGGTCAATTTTTTGGAACCTGCAGCAGGTAACACACAAAACTTCCACA 
GCTGCTACCAGCTTTCCAGGAGAGCCTGTGTACCTGGAGAGGRGAAGGCA 
AGTGCTTCCGAACTTGACTTGATGTCTTAGATTCTGCAATGCGTAGTCTG 
TAGGGACAGGCTGTAGCTTATCCTATAGGCTTGGGCTGGAGTCAGCAAGC 
ATCTGGGCTGGCAGAAGATAAAAGATGCAAAGGTGGAGGAAAGCATACGT 
GGTCTGGAAGACAGACTTGGTGGGTGGGTGGCTGCTACAACACCCTAGTT
AGAGGTAGAGGGGTAAGTCAGTGTGTCTTCTGCACAGGCCTCTTCCCCAC 
CTCATTCTTCATTTCCCATACAGCCTTGCTGAGTTATTCACAAACATCTG 
ATTCAACTGGAAGCTGGGTTGAGGATGACCTAAAGGACTAGTGTGATGCC 
TGCCCAGGGGTGTGGGCCCATAGTCAGAGTCCAGAGCCTCCTCTCAGCTT 
TTAGCACATCTCACCCACATCCTGGGTCCTTAATTAGCAATATGAAAGCA 
AGCCAAGTGACAAGATTTTGTCCCTGGGAAGTCCAGAAGCACTCCTTTTC 
TCATTTGTATAAGCATAATGATTTGCTTACATAAATAATCATGAAAATTC 
AAATCCCTCTCAGAAATCAGGTCATAAAACCATGAAATGCAGCATGTGGG 
CAAGAATCACAGGGAAAGGTAGGTCTTGGAAAAGAAAGGATGGCAGGGAG 
GAAGAAAGCAGGGTGCCAGGGGCCCTGGGCTGCTGTCCAAGTCAGGTGGC 
TCACCGTCTCTGAGAACATTTCACTTTCTGGTAAATGGGGCAGTTGGAGA 
TAGAAGGGTTG5GTGAATGCCAAGAGTGAGCACAGCTGAGGTCAGTGCTG 
TGCCTGCAGTCCAGGCGGGAGTAGAAATCCTGGGCCCATCTTACCTCCGA 
CCTCATTTCCTCCTCTGTAATAATGTGGGGGTGGGGGAAAGTTCTGGTCA 
TCAGCCCTAGCATTCCATGGTTCATTTCCTCATCAGTGATGGAAAATCAC 

CAAGCAAGAG AAC A(VS A TftT! miiTi *nmn_'Kmr+r*r**r*r*r*'* *m~~^.» 




CAGGTGGCTGAATGTACAGAAGCTGCCAATCATGAAAGATCTGGGGTACA 
GCATGCCAAGCAGAGGAAATGCGAGTGCAAAGGCCCCGAGATTGGATGTG 
GGCTTAGCACAAATGTGGCATGGCAAGAAGGCCAGTGTGGCTGAAGCAGC 
ATGAACAATGGGTGGAGGGGCTGAGAGGACAGAGGAGCAGGAAAGAGCCA 
GGCTTGGGTAGGAGAGGTGTCAACTTGATATATGATGCAAAGCCCTTGGA 
GGTTCCCAAACACAAAAGCAATGATCTAATATATGGTTTTAAAAATGCCA 
CTCTTGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAG 
GCCGAGGCGGGTGGATCATGAGGTCAGGAGATCGAGACCATCCTGGCTAA 
CAAGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGCG 
GTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAAT 
GGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATTGCGCCACTG 
CAGTCCGCAGTCCGGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAA 
AAAAAAAAAAAAAAATGCCACTCTTGCTGTGAAAAATTGACCCTGGGGGA 
AGGAGGAGTAGAAATGTCAAAAGTGGAAGCAGACCACTCAGGAGGTCAGG 
GCAATGGACTGTGCAGGAGAGACTGACATCTTAGACTCGGGCAATAGGAG 
AGAAGGTGGTGAGGATTATATTCTGGGCATAAAGGCAACAGAACTAGCTG 
ATGGCGTCAACGTAGGAGATGAGGGAAAGAAAGAAATCAAAGGGCATTCA 
TAGGTTTGAGGGTTGAGTAACTGGGGATATTTAACAGAAATGGAGAAGTC
TGGGGAAGGGGCAAGTATTGTGGGGGCAGGGGTCAAAAGTTCTGTATTTT 
GGCCAAGTTAATTAATATTTGAGATACCTCTTAGGTGTCCAAGTGAAGAT 
GTCAAACAGTCAATTGAATACAAAATCTGAATCTTAGCCCAGGATGGTCT 
CACACCTGTAATCCCAGCACTTTGGGAGGCTGAGGTGAGAGGATCACTTG 
AGGCCAGGAGTTTGTGATCAGCCTGGGCAATAGAGCAAGACCCTGTCTCC 
ACACACACACACACACA. AAAAAGTCA'i. wCAGGCATGGTGGCAGATGC- .
GTAGTCCCAGCTACTCAGGAAGCTGAGGCAGGAGGATCACTTGAGCCCAT 
GGTTCAAGGCTGCAGTGAGCTATAATCACATCACTCAATACTACACTCCA
GCCTGGATGACAGAGAGAGACCTCATTTATTAAAATAAAATTTAAAAAAA 
'TAATTAAAAATAAATCCAAATCTTTCCTGAGATTCATATTCAGGAGTAA 




CAGCACAGTGAGGTTGGAAGGAAGAAATGGAGCTAACAAAGGAGACAAAA 
GAGGAGTAGCCAGTGAGAAGAGAGAAACATCTGGAGAGAAGAGAGAGCAG 




TCAGCGGCCCTAGAACAAAAGTGAAGAAGAGCTTGAGGACGGAAGCCTGA 
CAGGAGTGAACTGAGGAGAGAATGAAAGGTGGAGACATGGAGCCAAGGAG 
CACTGAGACTCCCTTGAGTAGTTTTGCTGTAAAATAAAAGTGAGTGCAGA 
GACGGGGCAGGGGGACAGAGAAATGCAGGGGTAGCTGGAGGGAGCCACAG 
AATCAAAAGAGGGTTTTTGTGTTTAAGATGGTAGTTGTCACAXAGCACAT 
TAGTAAGTTCATGTGAATCACAACGTAGGTGAGACAGATCACTAATGCAG 
GAGTCAAATCCTTGCAGAGCCCCCAGAGGAGGTGATGAAGGGAAGTGATG 
GACATCATTCAGATGCAAGTAGGTTAGCAATTCCTGGGGTACAAATAGGA 
GGTGACTCCTTTCTGATTGCTCCTGTTTTCTGAATGAGATAGCACATAAA 




CACTAGCCGCTGCCACATCCTCTTGGGGCCATCCTCTGCCACACTCCACA 
TATTGCTGTGGTTTGCTTGCAACCCCTGGAAGGTCCTACTGGCTGCTCCT 
AG^GAGTCTGGGCGGCATCTCTCCCTTACTCGTTATCACATGGTGCTGT 
AAGCAGTGGCCACACACTrTAGCrGGTGGGATGGGCCATCACAGGCAGTA 
AATGCGAAAGACTGCTCAGATTTTAAAGCACCCATGAATCAGTAGAATGA 




TAAAAAAATTAGTTGAGTAGGATAAAW;^J.AiM»i»v J «i«i iwv -^7^2r 

CCAGATAGGAGGTGCAAAATTGTCCTTACATAAATCAGATGGAAAAMTT 

GAA^GCAGATAAGATAAAATAGGTAAGCATGACATTTAAAAGGTACTCA 

^CGTGGTTACAAAACCAACTCACAACTAAAAAGOTAGGACCTCTC 

GCTGACTTAGGAGCCTGATCCCAACTTTGAGAATGACTCAGTGTGCTACC 

CTGTGGCTAGTGTAGACCAATGATCCTGTCTCAGAGTCACTAGCCAACAG 

cccaStcaagtaattc^ctttgactcacsaaacctcagtgtcag^w 




-gacctaaaggaaacccattgcag<^cgcttttgtgttaagtttaca^ 
ctgagacaattctgcacattaaagaatataaaatattaccttgtaattcc 

AAmGAAATGTGTAATTGACATTAGACTTCTATTTTAATTTC^ 
TAAAACAATGTGGTTAAGTTTGTAAAAGGTGTCTGAATTTTGAGT 
ITACTACATTTTTTTTTAA TlTrCl 1 1 1 1 1 1 X IGGAGTTTTAGGGATTGC 
TTAGATGGCTAGAAAGATCGCTAGGCACATGTCC 

GATGTGTGTACGTGTGTGCAAATACCGTGCCTTTTTTC 

GAAACAGAGTCrCACTCTGTCGCCCAGGCTAGAATGTAG^ 

CAGCTCACTGCAACCTCCGCCTCCCAGGTTCCAGTGATTCTCCCGCCTCA 




^T^CAGGCATGAGCCACCATGCCTGGCCAATACTGTGCCATTTWTTA 

5rAG^CTTGAG^CCATO»TTTTGGa 
CA^TACTGCACAAATACCAAGGGACAACTGTAT^ 
^TAATAAGCAGGACGCTGAAGGTAATTGCCCCAATAAAGTCATGAT 
-r^rtGTGTCTGAACCTCAGCCAGTTTTCATACTCAGGACCTATTGGCT 

^^I^ & T&TgTGCAGTCCTTCCCCAAAAGGACC TTA(^TTTACC 
ATAC^GCTATGTCCTGCGTGAGAGGGTAATACTCAGATTTTTTTTTTTTT 
FACACAACGTOTACTGTGTrGCCCACACTGGAGTGCACTGGCT 
. ^ A w . . «w w ^ w^w iuv. .Lin. . -TGGGCTCAAGTGATTCTC '

Iglf^^^JCTGG^TTACAGGCGCCCGCCACCATGCCTG 

r CTAA in i i G a ATTTTTAGTAGAG a car. a rrrr-nw™ 




GAGATTCCAGCGTGCGCGGCCATACCCGGCCGG^AMTCTTTATATATTC 

T x T^TGGCATGTATTTTTAATTTTTAATGGTGTCr CTCAATGAAAAAAGC 

TTAACACTTAAATGAGGTCAAATTGATCACCTTTTTAT^ATGGTTGAT^ 
^J^TCATGTGTAAGG^^ 

aaagatttcttgtgtattttgtcctaaaagttttaaagttttgotScc 
catctgtgcacatttcacatttgctacatctcactgactgcScSSgc 




* ww«««v.v.u i irtftrtAi ijiALCtiTCiGAGGATCCCAGCCCAAGAGATTC 

tgtatgactggtctaagatgtggtctgggcaccaggtgatcccagtgtgc 
agccaggcctgaggccactggatttggtggtaaatgaggtaactatcaag 

GGTACAGACGTTGGTTGCCAAraf^:rTT(w:rT-rr!s ^ -t-t-™ . ^^.mm™ 




rTTTGCTTGTTTGTTTGTTTGTTTGTTTGTTTTTTCCTTTTCAAGAATGA 
GGTTGAGCCAGACTTTGACAGCTGGGTGGGAAGTGAACATGTGGTGATTG 
^^^^E^^^^^^^^TAATAATTAGAGAGTGGGC 
GTGGGAAGACATGCTGGGGAGAGTGAGCAGGCCGGTTAGCCCTGGTAGAG 
GGTGCAAGAGAGCAGTGCGGAATCTGCCAGGGAGACAGGTGGGTGACCAG 
GGTGCCAAGGGTGTGGCTTTTCCCAGGTTCCCATGGACACAGCCATCCTC 
CCAGATGCCCAGCCTAGCTGTGAGTGAGCAAGAGTTCTGGATTGTCTCTC 
TCACTCTGTCTTTTTCTCTCATTCCAGAAACAAAGCAGTGACTGGTACTT 
AGGAGGAGAATCAGGTCAAGTTGGGAGAAACTTGCTTCTGCTCAGGGGAG 
CAGAAGCAAGAATGGAGGCCCCACCCATGCTGGAAGATGATGAGGGTTTT 
GGTTCAGGGAGGAGGAATATTGGGGATCTAAAGGGGCCTGGGAGTGGGGC 
AGGACCCTGCCTTAGGACAGGTAGAAACATTTTCTATAAAAAATGGGGTG 
GAGGTTGATGGTAGGACCAGGCATCTTTAGTTGGCTCCCTGGAGTGTCAA 



>jCCC . TGAGATGGTCTTTAAAAGCCATGCAGTGGGG7TTGAATCTGGTGT 
TCAAGCTCATAGGTTATTAACATAATGACACTTGGAAACTATTTGGGAGA 
GCTCAAGTGAGTGGCCTGGAAGTTCTGTGTTGGTGCAGGAGGTGACTTAG 
GATGTGCTGCTCCAGACTCATATCTTTGACTGCACACCTGATGCTTCATC 
TGGCTATCCTGTAAGCACCTTCAACTTAACATGTCCTACACAGAACTCTT
GATATTCCTGTTCCTCCCCCAGTTCCTCAGTTCTTACCAAATGTTCTTCC 




CTCTGAAATAAATCCTGCCTTTGTCTCCCAGTTCACTCCAGCCACCCATC 
CTGGGGCTGCACCCTCCTCCTTCCAAGCCCTCTCCCTTTCCTTCCTGGTG 
CTGCCTGTCATGTCAAGCATATGCATCAGTGCGACCAGGACATTTGAAAT 
GCAACCAGTACAATTGGGCGCGGTTATGCCTACCAGTTTTTCTTCCTTAA 
ACATTTTATATTTATGTTTGAAAGCATGCCACCTTTCTTCACTTGCCAAC 
TTGACAGATTTATTAGTTGACAACATCCGCTGATAGCATCAGTAATAAGT 
TAATTGTTTTTGCACATGTAGCTTTAATTATTCTCATTATCATTTATAGG 
AGTTATTCTTTGTAAAGGGTAACTGAGTTTTCCAAAACAAACAGAAATTT 
GGGGTGGGCC GATGGAGCGTGACTCATGAAATCAGATTCTTAGAAGGACC 
TCGGCAAGTCTCTGGGTTfirTrtTTa iTri*.nnn*>rv'tTr*r*r»r<r*r<~+ — 




TACTTGCCTAGGTAGTGCTTGTTTCTACTGAACTGTCAGGGATCCAATTC 
TTTGTGGTCTAAGTAACAATACTCAGATTCACAAGGAATTGATTAATAAG
CCAGAATGCCAATGTATTACATTTTTGATGAAGACCATATTTACAGTGAT 
GAATGTGATTGCCATTCACATACCTTTCTGGGGATGATGATTCTTGTACT
^TTATTTTAAAAGACATAGAAAACTAACTTAAGAATCAGATTGCTTGGCT 
GGGCACAGTGGCTCATGCCTGTAATGCCAGCACTTTGGGAGGCCAAGGTG 
AGTGGATTGCTTGAGCTCAGGAGTTTGAGATCAGCCTGGGCAACATGGTG 
AAATCCCATCTCTACCAAAAATACAAAAAAAAAAAAAAAAACAACCAAAA
AGAATAAATTAGCTAGGTGTGATGGTGCGTGCTTGTAGTTCCAGCTACTT 




CCGTCTCAAAAAAAAAAAAATCAGATTGCTTTATTGCTGGTTTTCTTTCT 
AAAACTGAGATTGGGTCCCATCATCCCCTGGCCCCCATTGGTTAATGGTT 
^CTCCTTTGTCTATTGAATAAAATACAGATGTCTGCTTTTGGCAACATGG 
^TGAATGTAGACACTGCAGGGTCTTCCTGACTCAAAATGATTTAGGCTTA 
GATAAAACACATTTGGAAATGCATTTCTGGATTAACACCAAGGAAAGGAG 
ATCTCTTTAAATCCCCTTTCTGTTCCCCCCTCCCTACCCCCTCCAATTGG 
GCTTAAGTAAGAAGGGTGGTTACCCGCTAGTAAACCCCCTTCGAAGGGGG 
TCTTCTCCTCTAAGGGAAAACCCTTGTTTTGACATTTGCTTCSATGGGCC 
CTTGTATTTTGTTCCTTGCTAAACGGGTGCTAAACCAGGGGCCTCCTCTT 

>Contig46

AAGGCTTTTAGAATATTTGCACACTTTAGAAATGGAAATGTTrTTGGGGG 

GCGAGTTGTCTTAATATTTCATTTTTCTAGCTTGTGTGACATCCTTTTGA 

AAGCAGCAATTCTGGCCTTTGTGAGAGATGGTGAATGCCTGCAGGTGTGT

GGACCAGT3CGTCCCTTCCTTCCTACATGCACGGCCCCCAGCTGGGCCCA 

GCAGAGTGCTGTTACAGAATAATTTCCAAGGGCTGTGTCTCTAACCTTTG

GTCTTGTCCCCCATTGCTGTAGATTTGGCCAATTGACTTCATAAGTGCCT 

CTTATGAACATAGATGTTGGCAATGGAAGTTGAGGACCAGTCAGTGGTTG 

TTTTATTGAACACACAGCGTAAATCCCAACACAATGCTGACCTAAGAGAA 

TTCCAGCCACTCTGATTCTCAGTCTCTTTATATCTGAAAGGGTTCTGTTC 

CACTTTTTCCCAGATCAAAATGTCCCTGCAGCTACTCAGCAGAGCTGTCG 

CAACTTATACGTAGAAGAGGTAACAGTCCACAAACAGAAAGGCACAGGAC 

GAGAGTGGTCTGGGTGATGCTTCCTGTGGGGGAAAAGGTGATGAGGGTGC 

ATCTGCACACCTATGTTCATAGGTAAGTCTGGGAGGAGGTGACCTCCCCT 

TTGGTTGAGGTGCTGAGGCGTCTTGTTAGAATGGCACTATTCCATTTATC 

TGATGC^GTCTGTGGGAATTTTGTGGTATGGCCACCACAGGTACCATGCT 

GGGAACAATGCCAGATACTGCCTGCTAAGCCACAGCATGAGTCACATGAG 

CATTTGTGGGCTTrGGGAACTAAAGTTATTGAACGATAGTTATCTGAAAA 

3GAATTTAGGGAAAGGGGACTTTAGTCCAGCGAACAGTTTGCAAACCAGG 

GGGAAGGCAGCCTTCAGCGTAAAATGAAGACGTGTGTGCCCCAAATAACA 

AAGGGAGAGTTTGTCTTTTAGAGAGTAAATGTCCACGCAAGGTTCCACTT

AGGCAAATGAAAGATGCAAACrTGCTTAGTTCTGATTTGTTTACATTTGC 

TGAATTCGGATTGGTCCGTGCAGGCTTTTCTGGGAACTCCAAATACATGT 

ATGACCTCTAGTCATACATGGCAAATGGCCGCTTGGCTCTAATTTGAATT

TAGGCCCAGTTAGTCACTCAGGATTAACCTTTTTCAGGGTTCACAGCTCT 

GAACAATGGACTTAGACCTGCAGGACATAATCTGTTCCTAACTCTGGGAC 

TACCTGTGCCTTTTGACTGTGCCCAGTGAGCAGCTGTGGCTCTGGGCCCA 

GACCCACAGGGCGATAAGGCACAGAGGTACGCATGGAGCAGGCTGTCCTT 

GCTGAGTGATCATGAAGATACACTTACATAGAGCAGCACTTTTCCTTCCA 

GTCTTTGTGATTTAACTCATTAGATCCTTATAACAAGAGTCAGTCCTCTA 

TTTAACCCATGAAGCACAGGTGQAGTCCAAGCTTAGTTTGTGAAGGATGA 

GCCAAAAGGATTCTTCTCTTGTAGACCTCAAGCTCAGCTCrCTCCATGGG 

CCCTGGAGTAGGTGAGAAGGCCTCTGTCTTCCAGAGCCCACTGCCAATCA 

tctacattttctgttagcccaattctaggacattgctttaccaactgaag 



7CTACA77T7C7GTTA(X:CGAATTi: i ;a<j«j<m-h. liw." <-«™- * — 

GCTGAGAACTATCMAAGITATA^ 
AGAACAGAAAATAAAACSATGAGAATCTATTAAACATAGTGATOTACTGG 

AAAAGGGGGTCTCAAACCAGACCCCAAGAGAGAGTCCTTGGATTTCACAC 

AGGAAAGAACTCAAGGTGAGTTGCAGGGTGCGGTGAATTGAGAGAGTTTA 

TTGAAAGCTATTCCATTACAAAGTAGAGCATCCTCAGACAGCAAGTGGAG

mn« xm » .~.i.ii.|»r r TTnT&T&rtrtA&TC'ri.'tjTCTATATAAA 




GAC AAAw i AAw*' x wl\3U^ a^^x w ± w ^ www * www - — — — 

TTTATTCTCCTATTGATTTAAAGAGAACTATCCTTGACATTTTAGTGTGT 
rTAAGTACATCAAAGCAlAACTATAATTArCTTGAAAGCATATArtTr^
TAGGGATTGGGACATCTGGGCTTTCTGTTGTTGTAGAAGTTTGTCC^TGC 
AGGGATTACCAAGCCACTTCCTTAGCTGTAAACATCTTAGGGCCATGGGT 
CCTGACTGGCAAGGAATGTGTCTTGCTAGTTTTAAGATGGGCTTGATTTG 
AAAATGGTGTCCATCTGGCTCTCCTAGGCTCCTGCTTTCCTAACAGTAAG 
GGTAAATGCTATGTTATGAAATGTCATTTCTGCCTTTAGCTTGCAAACTC 
TTGATGGTGAAATTCTCCTGTCCGTTTTCAGTGGGGTATTTATTCTGCA T 
CCACGTCTTCACAAGGAGCTGAAAACAAATTGGATGGAAGCAACTGGGT 1 
TTATGGGACACGTTAATGTTTTAATGTCATTTGGTGTGGAATTCAGATGT 




TTTCTATTATGGCAGCCCTATCCTGAGGACACAAATTTCTGCAGGGCT^C 
TGGCATATCTCTGATTAAACAAATGTCAACAAGGTTAAAACAAATGTCAT 
CTCTGATTTGTTTGTTTTAAAGCCTGGATTTACTCATTGAATATTTCACT 
CCTACTAGCATGTCTTGTAGTAGTTTTCTTCAGGGACCCTAATTATTGCT 
ATTAAAAATATGTGTGCAGCTACATGTTTTTTTTTTATCAAT?TGCAATG 
AAAACTTTAATTGAATAATCTATTAGTGTTATTATTTGAAAGTGAAATCT 
TTTCCTTTTGCTTTCTTGTTCTCACACATAGTGCAGACAGTTTCCACACG 
GGCTCATAAAAGGAATGATTCTGCCTTGTGTGAACTTTTTGCCTTTATTG 
TTAATTGCACCATTTTGTGACTGGCTTCTTGACCCTGTTGTAACCAAGCT 
CATAATGTACATTATTTCTTATTTTGCAGTTGTAGACACTTGAGGXAGTT 
CCCATTCTTTGTTTCTTCTTGCTTTTGTTCCCTGTGATAACTTTTTCATG 
CAGACATTTTTTTTTTTTTTTTTTTTGAGACCGAGTCTTGCTCTGTCATC
CAGGCTGGAGTGCAGTGGCATGATCTTGGCTCACTGCAACCTCTGCC^CC 
CAGGTTCAAGAGATTCTCCTGCTTCAGCCTTTCTAGTAGCTAGGATTGCA 
GGCGTGCACTACCACACCCAGCTAAATTTTTCAAATTAGCCACCCCACC^ 
GGCTAATTTTTGTATTTTTAGTAGAGACAGGGTTTCAACCATGTTGGCCA 
GGCTGGTCTCGACCAGGTGATCCACCCGCCTTAGCCTCGCATAGTTGCAG 
GTGCTATTCTGAGCTCAGGGCTCTGGCAGCTACAAGCCCAAGATGCGGTC 
TCCAACATGTGGCCATTCAATGTCATGGCGCCCTCTACTGGTCCTGGGAA 
GCGCAGCTCTGCCAGTAGCTCCAGCAGGGCACAGCTGTTAAGTCGTGATG 
TTCTACAGGTGACCAAAGGGCAATCTCTGGACTCCTTAGCCGCTAGGTCC 
TCTCTGTAGCAGGACCCAGGAGAAGGCAGGGGCTGAGGATGGCTCTCTTA 
GACATTTGTGATGAACCAAACGTGTGCATTCATGAAACTTCTGTGAGCAA 




GTAATGGTGTGATTGGGTTTGCGTTTTAGGAAGATTTCTTGGCCAGAATG 
AGGCGGGCAACCCAGAGCAGGGAGTGGCCACATGTGGGTGTGCAGTTATG 
GGCCACTAATCCAGGTGATAAATGGTGTCTCTGAACTTCAGGTGGGGGTG 
CCACATGTCTCCATCTGCTCTGTACCCTTGAGACTGGCCTTATGGGCTGC 
CTTAGTGGTCTGTTGTCCTCTATCTCCTGGTTGGGCTCAGGCAATGGGAG 
ATCAGAGGGAGGAAAGAGAGCTTGGTTAGAGTGCACCCGCGCCCCTTCAG 
GTTGGCAGTGGCCACATTCCCCTATACAGAAGGCCACAGTTTCTGTCAGT 
GGCCCTCCCACAGCCCCAGCTTTCTCAGTGGGCCAGCCACCTCCCCATCC 
CTTGCTCCTCCTCCTCCAGAGAGGGTTGTGGATTTCCACTGTCAGCAGTG 
CCTGGAGCTCCACCATCTCCTGCTGCTTCCTCTGGACCTGCCTGCAGTTT 
TATAAATAACCTTTCCTTACATTACCTCTAGCATGCACCTTTTGTGTGTA 
TACTCTGCCCCCTGTCAGCACATGACTCATGCCAAAGAGTTTGAATTTTT 
TTCTCCAGGCAACGGGAGGTCATTGGAGGATTTTAGACATTGAGAACAGA 
TGTGTATTGTGGAAATATCTGTCTGACTGAAGTGACCAGGATGGTCCAAA 
AGAGCGAGAATTTGAGGCAAGCAAACCATCAGCAGGCCAGCAGCAGAAAT 
CCAGGTCATAAACAGGGAAGCTGAGGCTCACAGGGTTGGATCAGGGAATG 
GGAGAGGGAAGCCAAACAATTCCATGAGCATGTCAGTTGCACATATGACT 
TGGTAACTATTTTTATTTTTATTTTTATGTTTTGAGACAGAGTCTCGCTC 




GACTACAGGTGCGCACCACTACACCCAACTAAGTTGTGTATTTrrAGTAG 

AGATGAGCATTCACGCTGTTGCCTTAGACACGG 

>Contig47

AATATTGATTATTTGACCAGAAATTCATGCAGCTAACCGTGACCCCTGGC 
JjVAATAAAATAGTGTAT*.- GTACGTGCATATACATGCAAAGAAATfSAG'i. .
GAAACTAGAAGGATGTCAATCAAATGATAACATGGTCATCTTGGGGTCGG 
AGTACATTTGGGGATGAGGGGAGCTGTAAAAGCAGACTTGGACCTTTTCT 
TCTACCAGTACCGTGTCATTTGAATTTTGGAAAGAAAAAAAAAAACTCAG 




CTGCTGTGGCTTTGGTGAGACGTGCACTCAWic i Ufu-u ijii.n.ui*w 
C T CCAGTAAGTACACATGAGCAGAGAGGCCTCAGCTCAGCTCTTCCTGGT 
CCCACCAGGGTTGATTCTTTGAGAATTCTAGAATGCCACATCCTAGGCCC 



CCCACCAGGGTTGATTCTTTGAGAAI it TJ.AVa«AivjuuKv^viwv. i « M wv.v.w 
CCCAAAGAAATCCTGCATCTTACCCCCAGAAATATGAATCATAGCAAATT 
TCAMTCAACCATCGTTTAATACTCAGVGACTGGGCACATCCftAAAACAT 
ATTTTCAGTTTTACAACAGTGCCTGGTGCATATCGGCACTATTTGTGGAA 
GCAATAAATCGACACGGAGCTGAAACACAAACAAATGCCAAATTGTTTTT 
ATAACACCTGATTTTCTTTCTGTTTCTTTATGCAGTTTAGTTTTGTTTTG 
CTTAACTCTACCTCAGACCATAGTCTGGTAAACTCACCACCCAGAAGCTC
» R ^^^r^i-T»Tr.rar:rrarTASGTGGCAGGAGAGAGTTTCCTGC 



TCCAACTTAAGGTTCTGAACACAGACGCTTCCCCAGAAAGCCATTGl. lit 
TCAGCACCTGGGAGCCTTGCrTTGCTTTGCTTACAGACTCGCTGTTCTTA 
AATCACTGCCAAGATAACATCTGTCTCTTCTCTTACCCTCTATTTCGATA 
TAAGGACTCCT^CTCTTGTTGCTTCCTATTGGCTACCTCTCCACAGGGA 

AGGGAAGcb^TTTAAAAATTGGAGAATTTAGGCCGGGCACM 
CCTGTAAT CTCAGCACTTTGGGAGGT CGACGTGGATGGAT CACTTAGGAG 

"?2?Z~""r~n'Tn'vr> a rinrrr a rj-Ta rTrar^AGGCTGAGGTGGCAAGACTG 



^TTGAGCCCTGAGGTCGAGGCTGCAeT^«tuuHi^iv^^w^w*w^- 
T^CAGCCTGGGQ^CAGAGTGAGACCTTGTCCCAGATAAATAAATTAAAT 
^AAT^AATTAGAGGATTTAAGGATTTTCCCTACAGACACCTCCTTATTT 
"CTCTGGCCTrTrCTGACTACTCTCCCTAACTCCCTGCTCCTCTGGTCTC 




A^CAGTT^AGCAAAAACCCTCCTAAGCCAGTTTATCA^GATCCCCT 
^CTCA^TATCCATCTGATrGGATTCTTCATCCCCCACCATTCCCCAGTGA 

tc?c^Sggccttt?ttcag^ 

rcC^CTCACCCCTGATATGCCCTTTTAGTAATTCTTCATCCACAGGTTC 

A^GAGCCCAGTTCTATACTGAGGTCTrACTTCACCrCTCGCCATAGT 

TGAATAWIATTGGTTTTCACATTTAAAAACTGTC 

TTGACACAGGGTAATTTTTATTCC^^ 




FIG. 4 (28 of 61) 



WO 99/06426 



PCT/US98/16102




3AGATTTAGTTTTTGGCCACAGTGCAAAATAAGAAACGAGGCTTCAACTG 
^^iI^55I^5TI^ AGaAA ^ TG ^ ACTCCCTTG AAGGACCTGTGAAG 
i G*G x i. <_G*_ i ATGAGAAAATGACCAGAATCCACGTTCTTAGCTGCGGGAC 
TCAGGCTGACTCCTGTTTCTGGAGCTTGCACAAAGGGCAGGGAAATCCCT 




~ " ""11 ~ 1 UftAAUCAGCTACATGGT 

CTCTTCAATGTGGCCCAAATGTTGGAGAACAGAGCTTAACTGAATCAGCA 
ATTCTATACTTAGAACZTGACTCTCTCTTTATTATATCTCACTACTACCTT 
GATATTTGAAATATTCAACTTTTTTCAATCAAAAAATAACAATAATTTAG 
GCATAATGACTACTATGTCATTTAATTTCTTGCTGATATTTCAATATCCC 
ATGCCAGGAATATTGAAAGCTCAGCTCCTTAAGAGCTGACTATGGCATCA 
ACTCCCAACAACCATCCTTCCAGAAATATTTTCCCCTTTCTTTTGTTATA 




wiwnni x uw\u l UlAAhU 1 U i AAAAUAUATACC 

AAATTTCAAAGACATGGCACATAATAAAAAATGTAAAATATCTCATTAAC 
AATTTTTATATTGACTGTGTAAGTAACATTTTGAATATATTGGATTAAAT 
ACATGGATGATGCCCCAACACCCACAGTCCCTTATCAAGTCTCTACTTCA 
CATTTTTGTACTTCTGACTTAGAAATAGCACTGGCGTCTAAGAGCCTATT 
AATGTCGTCAATAGGTTCTTGGGAACCACAATTTTAAACAAAATGACATA 
TAAGAAAACGAATAACATTGAACAAAATGACATTATTCGAGGACCTGCTG 
CATGTTGTTTCACTTAAAGTCAGTGTCCAAGAACCTATCAGTGACATTTA 
GTGAGGACTTGCTGTCCTTCCTGTTTACAGGAACCTGGGCAAGTTACTTA 
ATTCCTCTAAGCCTGGTTTATATCCCTGCAAAGAGAGAAGGATAATAATC 
ACCAGTACTTAGTGATGTCGTAAGGAGAAAATAAAATAATAAATATGAAA 
TGGCTGACAGTGTCCTTGTCACACAGAAGATGTGTGATCCACAGTAGCTG 
CTATTGTCTGCCTCACTTCACTAGTAATGGTCCAGGGAGGCCTTTAATGT 
GCATGGTGCAGTACATTCACATGTTGGACATGGGTGAAGGGAAAGACCAG 
GCTCATCTAAACACAATAGGATGCTTGTGGTGTTTTGAGGAGGAATCAAG 
GACTAGTTATCCACAGCTGTAACATGCATGGATCAAAAGAGATAAGGCAC 
ACAAAAGACTTTG TCAGTAGCAAAGCATTACAAAATG CAGAGAC CAGCTG 

TGGGTGGTGGTGAGTCAGACCCAGCTTCCCTCTGTGCCTGGCTGAGTGGT 
TCTGGGCAAGTCACGCCATCTGTCTTGATGCCCTTCCCCATCTATAGAGA 
GGGAGCAACTGAGGCCCCTTCCAATACTGAAGTCCTTTATTTCTGCTACT 
TTAGAAATATCCACATTTTTGGTAAATTCAAATGATCCAATGATTCCATT 
TCCTAATGTTCAAAACTAGCCCCAGAAACATCTAAATGAATCAAACAAAT 
AAAATATTTATTGTGTATGTTTTGATTGCTGAAACTTCTATTTTAGCAAC 
ACACACACACACACACAGAACCCATAAGCCTTCATCTTTCCTTGGATAAA 
CGAGCCTTCCTGTCTGGCCATTTAAGTCACGATTAAGTAAATGATTTCCA 
ACTCGCCTTTTGCAGCAGTTCaGATGGGTCTTTCCTGCGTGGCAGTGGCC 

ctcctgacttatgatttcctgtgtgtcggcetgttaccactgcagcttaa 
ctgaggaaacaagaacaaaacagcttctgaccccaagagactgttggagg 

CAAAGGCTTCAGTCCCAAGAACCTCACACGTGGGGAGCCCGAGAGCCCAG 
C CCTGA CCTTTTCTCCAGTAATAACATAAGAAACAACAGGCACTGGCCTT 
ATTTTGGATACAAAGAGTGGTGCTTTTCCTTAAATCTTCCTTTAGTCAGG 
GCTACCCCTTCATGGACGCCCCAACATCCATGGTTCCTGCTTGAGTCCCT 
GCTTCCATATTCCTGCACTTCTCACTTGAAATATCCCTGGAGTACGTTAA 
GCAGCCAGGTTTGGAAGTTCTTGCTGTGCAGGCGGGTGTGTGCATGTCCT 
CTCTCTCAACAGGACACAAGCTCCCCAAATCAGACGGTATGCCTCCACGC 
CCCTTCCCAAGCCTCCCCAGCAGCACCGAGCATGTGAGGGGAGCTGGGGC 
CCAGGCCATGATGGGAAGCACTCTCTGCCTAAAGACTAGGGTGATGCGCC 
CTCAACTGTGGGAATGAGCCCCAGCTCTGGTGTCTGCCTCGGTTTTTCCT 
CCTGGACAATCAACATGAACTCCTCACCCCTCTTATCCACTTTGCATAAA 
CTGAAAATAACAAACCCAGGGCTCTTTCTGTCACAGGAAAGGGTTTTTTT 
TTATAAAATTAAACAGAGATGATTCAACACACCCAGGATATAACACATGG 
GCCATGAATCAAGGGCAGCATTGCTCTGGTCAGCCTGTTGTTTGGGCCCC 
CTTGGCAGGGCTCTCCCCi'GAATCTTCCCCTCTTGACTCCCATCANCACA
GCACrCCANCTTTGTGTTACAGGCGATAAATGGGAAAGGGGTAAAT 

>Contig48

CATTCTTAATTAGAGAAACGCTCATTAAACTAGACACCCAAATTCTCTGG 

GGGGGGATCATTCTTACAAGCATGCCCTTCTCTCTTAAAGAGAGAGCACT

TTTTTCGCAAATAATGCTGCCATGAACATACGGGGTGCATGTATCTTCGT 

AATAGAATGATTTCTATTTTGGGGGGTATGTACCCAGCAATAGGATTGCr 




"CCTATTTCTCTGCAACCTCGCCAGCACCTGTTATTTCTTGACTTTTTAA 
^AATCGTCATTCTGACTAGCATGAGAGACAGTATCTCGTTGAGGATTTGA 
TGTGCATTTTGCTAATGATCAGTGATGTTGAGCTTTTTTTCATATGTTTT 
TTGGCTGCAAGAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTTGC 
CCACTTTTTAATGGGGGTTTGTTTTTTCTTGTAAATTTGTTTAAGCTCCT 
TATAGACTCACAATAACAAAGACATGGGATCAACCTAAATGTCCATCAAT 
GATATAACGGATAAAGAAAATGTGGTACATATATACCATGGAATAGTATG 
CAGCCATAAAAAAGAATGGGATCATATCCTTTGAAAGGACATGGATGAGC 




CATGCTCTCACTTATAAlj 1 SjVrtaAU V. i <j«m»iv~ i wiwm«_j-iwwvr>*»»w« ~ • 

GAGAGGGGAACAACACACATTTGGGGCCTGTCAGGGGTGAGGTGGGGGAG 
GGAGAGCATTAGGAAAAATAGCTAATGCATGCTGGGCTTAATACCTAGGT 
GATGGGTTGACAGGTGCAGCAAATCACTGTGGCACACATTTACCTATGTA 
\CAAACCTGCACATCCTGCACACGTACCCCAGGACTTCAAAATAAAGAGA 
GACAATACTTCTCCCTTAAGTGTCTACTGTTGCTTTGCAATAAAAAGTTC 




GTCAAGAACGTGGACACTGGCTGeeeCTtrtiAVaAU i laluiu^iwuuv^w 
ACCCTCCTGAGCCCTCCAGCAATACAACTTTGACACAAACTATGAAATCA 
CAGATCCAAGAAGCTCAAAGAACCCAAGCACAGGAAACATGATGAAACTA 
CATGAAGGAACATCAGAATTGAATTGTTCAAAATCAGTGATAAAGAGTAA 
ATCTrAAAAGCAACCAGAACAAAATATCCATCATATACqCAGAAATAAAG 
ATAAGTATGACAGCAGATTTACAAATAGAAAAAAAAACAAGTGCAGCAAC 
AGAAACAAACTATCAATCCATAATTCTATACCTAGTGAAAATTTCTTTCA 
,*,-«tv** » * t» a & & & & aTTiTTTTf^GGAATACAAAAGCGAAAAA 



AAACAAAGGTGAAATAAAAAAATTATTTTCAGGAATACAAAAGCGAAAAA 
ATTAATCACTAGCATTCATCACTGCAAGAAATGTTAAAGGAAGTCCTTTA 
GGCAGAAAGAAAATGATACAAGGTGAATATTTGGATCCCTGCAAGGAACT 
AAAAAGATCCAGAACTGATAACTTAATGGGTAAACATGTAATTTTCATCA 
^V?___. . » n * t<« » t i" & ti t aTrr &T&TGATAGACTACTA 




"TAGAATACAAAAGAAGAACTAt, ;TiAi.VjV>Ki.u iwinnwi j. 
^CAAAATTATTATTGAGTGAAAGACACCAGATCAAAACAAAGTACATAC 
^GTATGATrCTGTTTATATAAAACTCTATAAATTGCATGCTCTTCTATAG 
*GACAGAAAGAAGATCAGTGGCTGCCTGCAGACAGGAAGAGACTACAAAC 
GAAATGAGAATTCCTTAAGAGATGATGGACATGCTCATTACCCATCATA 
GTATACAGCCATAATGGTTTTACAGATACATATATATGTACACGCCAAC 
lTAAATATAAGTTATGVAATTACAGTAAGTTCTGACTTAATGTCACTAGG 
, - V! */*T«w»Krtr&aaaT«a 1 Tf3T^CAGTGAAACCAATTT 




taccataggctaattgataxaaauai ^ i iAov. i. i. i. 
ttttgacatgaagtctcgctctatcgcccaggcaggagaagaagagttag 

gttttacagcatgtttctggtcacaagaacatcatcaaacttgtaaataa 

AGGCACAAAACACTTCTAATATTAAATATCAAAATAAATATGAGTTATAC 
AGAATTTAAGAAAGATTAATAAAAACAAGTAAAATCATTAOTATGG»T 
TTTTGGTAATCAGTGAGTTATGTGGTCATAGTGGAAGTGGGTTAAGTCAA 
GAAATAAATGTTTGCAAAACAAAAATTTTAAAGATCCTCTCCTACCACCA 
CACAAAAAACAAGAAAACACGGTGGGCTCGCrAAGCACTTTTGTACCACT 
CGTATCTTATGCGTTTGTATGATTATTGTAAATGCTTTATGATAATTTTT 
AGAGACAGGGTCTQ^CTCTGTGTCTCAGGCTGGAGTGAAGTGGTGCAATC 
^?^,^^;^Xrr&*rrTrrrr^aTTrAAGAGATCCTCCCACCTC 




AGCCTCCAGTGTAGvJTAvjwAw iAuw uuwi 

CTTCTTTTTTATTTTTTGTAGAGAQ^GGGGTTGTGCTTTGTTGCCCAGGC 

TAGTC^CAACTCCTGGGCTCAAGCAATCCTCCTGCCTCAGCCTCCCAAA 

ATGCTGGGATTTCGGACATGAGCCAGCAGCACCTTGCCCAGCATTTtATT 
agaaaccagccctggacaggacaccatcctaccgSgg^SacttaS?

tcjgcgtaaatcwcacagccgctttggaaaacS^cagttoSg 
ttaaatatacccaaactctatgatcca^tc^^ 

aataaaagcaatgtctacacaaagatgtatacacaSg^^SS 

GAAAACAATTATGCTAAGTnaa a ^^^Zlirr?!?^*? 7 




GGATTGCAAAATAGCACAAAAATATTGC»GG<^TC3ACA^A^C^ 

atcttgattgtggggatagtttaatgggtatata^gagatSaagctca 

TCTAATTATACACTTTAAATATATGTATTTCATTGTGCATCAGTTATTCA 

tcaacaagactataaaataatatatgcctacatacatttttaaataSS 
^ tc :^TT atata ^ t ^ tg ^ ct ^tatgtat^gatcS 

UGCAGAAAGGACTGATTAAACTCATS&r 1 r.rr^«-T-T^i^i 




•-cttg i.aatgtttcctttcttttagtaaaaatatattgacagttaaagcr^ 
gagaggtga(3aataatagtctcatggcttttgtgtccSa5aat^?a 
aactaagtgaaatgggagaaagcaaaaaaataaactt^^ 

ATTGCCCAAAAAGAGATTTAAAATGGAfltrrT'a^r ft r a^s^^iST/l^Z 



iv.j.vj\_Mvjou«Aiji rTAACAACTATAAAGAATTAA 
AATCTAGCTTCTACCAGCCCAAAGCCTAAAATGTTCTGCTTTATTCTTCC 
II A II AT ^^^ TA ^^ TATAI ^ATGTTTGCAAATGAATGCAGTG 
ATATTAGATCTCTAAGAGGTGCTAAAAATGAAAAGTACATATTCCAATTT 
TTCCCAATTTTCCTTCTCTTTCCATGAATGAAAAATATACATATTTGATG 
ATTTCCAAGTTTATACAACCGATCTTTCTCTTAGTTTTCTCTTACCAAAT 




, r * ~i?rr? : W * v - A1 - 1 <- 1 ^tcaggatcctgcctgacctgcgagg 

AGCAGV.AGCAAGAAGGAGACAGAACCTCCACGCTGAGCATCTCAGGGCTT 
TCTCAGAGACTCCAGAGGACCrTRATaiwia r*a n » ««~f.^^^«« . — 




?.X:^i^:ri 1 ^^uraocactggtaGGAATACAAACAGGGGAGCC 
AACAGCCTATAAATAATACTTTAAGAAAGGGCATGAATGTAATTACTTAG 

gaacaaaaggcaaagtggagagatgcctaggactqagctggacaagctgc 

ACCCTTTAGTGGCTCAGCCCATGGGCTGACAAGGAAAATGGAGGAGCTAC 

CAAAGAACX3TGGAAGGATTCTGGGAGAGTGGCCCTCACCCTGCCCAGGGC 

AGGGCTCAGTGGGAGAGAGGGAGATCTGTTATAAATGCTGCCAGGAGGTC 

GAGTCATGTGAGAATGTCCATGTGAAAACATCCACTGTGTGTATCTAAAG 
AGAGTGGCTGTAAAACAGGTCAGGGTr artrrnr«T-rfi«i»Tw™™ ^« 



.Vii: ^■ L ^- rtiiUi, - L ^SACCAAGAAAACTAAGGAGCATGGACACA 

AAGGGTTAGGTTGAAGCAAAAATTTAATAAGTGAAAGAAGAAGGCTCTCT 

GCAGTGGAGAGGGGAGTCTGAGTGGGTTGCCACrTTGACAGCTGAATCCA 

AAAGCTTTTATAAGAAACTCTTCTCATATCTGCAGCTGTTTGAGTAACTT 
CTCTTACCTftTAAA ArTnTr-rrvra tj aww^<««m« — 




— — *- * w» iaahm i i A^CTTCTCTTGTTTGTATAACT 

GTCK3GTTTGTTTTAGGCRAGCCCCCATCCCCTCCCTGTGTAAGCTCCCAT 
GGAGCCCACCATGTGCATATCTGAGAAGTGGAGGAAGCTTTCTCTGGGAG 
CTCACTGATCGTACAAAGAACAAGAGGCTTCTGTGCCGCTTATCTATTCA 
G E G i AGCCTGAGT ^ CCCaTCCTGCT CTATTTTTGCCTGTAGCTATG 
^^Jivji • • » *oA«nvjwwwo* * - —

GGATCCAGTTCAGACAGGAGCACCCAATATTCAGAAGAGAAGAACATGGT 

GTAAAGGTCCTGGGAAGGCTGAGAGGATTGGGACTCAGAATCCAGAGCAG 

AAGCCGTCTGTGAACAGAAGAAGGACCTCCCCCAGTGTAGCAAGAGGGAG 

GGAGGAGGGACAGATGC CAAGATGGTTCAGGAAGAAGGTTTGGTGGTAAA 

TGJGAGGCTGTGCTCACCTGCTGGCTTCAATTTTCTCTTTAAAATGTCAG 

ATGGAATCATTTGATGAAGGCCATGCCATGCAATGAAATGGCAGTCTGAG 

GCATGGAGCAGCTCCAGCTTAGCCCGTGTTTAGGGTAATTATGGCTCCAA 

CCCAGGAGATGAATATGACTAGGGAAAGTGAAGTCCAAAAACAAATGGTC 

TCAAGTTGACTGTGAGTCTTCTGGGAGGCTGAGACGACAGGTGGGGTTGA 

CAAGGGAAGGGGAACCCACCTGCTGAAAAACATCAGGCTGTTGGCTGGGG 

GAGGGGTGAGGCCTGTGTTGTAGAGATGGATGGATGC CTAAAGTTGGGTA 

AAGGTTTCAACTCTACCCTCTGCTGGGTGTGGAAATAAACAAAGACCACC 

CAAATGAGAACAAACAAAGACTATTTATCCAGAGCTTGCTCTGACAAGGG 

AGTCGGCAACCATCACTTGCTTGGCAGAGACTCAGAAGTAAGCAGGGGAG 

AAAGCCTCATAGCAGAAAGAAGGGAAGTCTTCATGTATGCCCTGAGTGGC 

AGCTGTAGATGTGGGTGAGTTGCAGGTGGCTAACTAGAAATGGGGGACTC 

CTGTGTGATTGATTAGGAGCATGTTTGGCTTTCTCTGGTTGGTCCTACAT 

TGGAAGAGGGAACAAAAAATTTAGGGCAGTTGTCAGTTATTAATCAAGTG 




AGGCTTC CTGGGTTGGCTATTGTGGATAATAAGTGGGTTTCCTGAGCTGA 

TTTCTGCAGATTGTGGATCAGAGTTATTTTATATAAACAGTCTGACCATT 

TTCCACTGGCATATTCCATCTTCCAAGAGCTGGCCAAGCTGCTGTCTTAT 

CTGTCTCCCCCAGCCCCTCCACTCTGGCTGTGAAAATACAAGCCACTAGG 

TGAGGAATGGGGACAATTGAAGACTGAAAGCTTTTCTTTGCTGGGTTCGC 

AGAGCTGAGGAAAGAAATGACAACATCCAAGTGTCTGCCCTGGGCCAGTT 

TTAGGACTGTAGTGGTAATGCAAGGACTGTGTGAGTTTATATTTTCATTT 

GTCTCTCTAACTAAGGTGGAAAAAAAAAAACAGAAAATTGTCTGTCTGCA 

GTCTCTGCAAAAGTCTAACACTGTGCTTCCCAACATTGCAGCCATTAGCC 

ACAGGTGAGTATCAAGCACTTTAAATGAGACTGGTCCAAACTGAGATGTG 

CTCTGAGAATAAAACACACAGCAGATTTCAAAGACCTAGTACATGCCCTG 

ATTTCAAGCTATATTACAAAGCTGTGGTAATCAAAACAGTATGGCATTGG 

GAAAAAAATAGACACATTGGTCAATGTGACAGAATAGAGAGCCCAGAAAT 

AAACCCGTGCATGTATAGTCAACTAATCTTTGACAAGAGTACCAAGAATA 

CACAATGGGGAAAGTCTCTTCAATAAGTGGTGTTGGGAAAACTAGATATC 




CCTAGAAGAAAACATAGGGAAAGAGCTCCTTGACACTGGCATTAGCAGTA 
ATTTTTCAGATATAACATCAAAAGTACAGGCAATGAAAGCAAAAACAAGT 
GAGAGTATATCAAACTAAAAAGTTTCTGCACAGCATAAACAATCAACAGA 
GTAAAGACATGACGTATGGAATGAGAGAAAATATTGACATCTGACAAAGG 
GTTAATATCCAAAATATATAAGTAATTCACACAACTCAGTAACAAAAGCC 
AAATAACCTGACTTTTTTTTAAAATGGGCAAAGTACCTGAATAGGTATTC 
CTCAAAAGAAGACATAa^TGGCCAAGAGATGTATGAAAAGCTGCTTAA 
CATAACTAATCATCAGAGAAATACACAAATCAAAACAAGATATCATCTCA 
CACCTGTTAGAATGGCTATTATTAAAAAATGAGATAAGTGTTGGCCAGGT 
GTGGAGGAAAGGAAACCCTTGTACATTATTCATAGGAATGTAAATTAGTA 
CAGCCATTATGGAGAACAGTATGGAGATTCCCTAACAAAATTAAAAATAG 
AATTACCATATGACCCAGCAATTCCACTTCAAGGAATACATTCAAATACT 
ATCAGTATCTCAATAAGATACTTGCACTCCTATGTTCGTTGCAGCGTTAT 
TC^CCATAGCCAAGATACAGAAACAAGTTAAATGTCCATCAACAGATAAA 




CAAAATC wTuAUA T U lUAbA iAAWL L J. wwnwwo>-" * * . — 

TAAGTAAAATCAAAGCCTGACACAGAAAGACAAATACCACATAATCTCAC 
TTACATATGAAATATGAAAATGTTAATTTTATGGAAACAGAGTAGAATGG 




TAAAAAATACCTGGCACCAAAAAAAGAATGCAAAATGTCTCAACAATGTT 
^^ UL iAruACGTAGGGCTGCTTAGGCTGAATGCTTGCAQAC
AA^TGCTTGCGTGCAGGTGGGCTGTGAGCTGAGTGCrtGGGTGCWCTG 

agccattggcagctgaccctatttcttggaacattcgSStg^gc? 

TTTTAAT GTTAAACCGCCAGGTCAGTTTGAA1 1 m .Tr rn ^T^i-n^ 

TTTTTTTTTTTTGCCTTTAGTAGGACCTGCCGTTGTGAGACTATCTGAGG 
TAAATTAGACACCCTCCTGGTTTAAGTCACCGCTCCAGTGACTAGG^GG 

A^CCACKSCGAGCTGCTGCTTCAGGGCCTTTGCATTTGCTCTTTT^G 
CCCAAAATGCACTTCTCTCACTGTTCACATGATTTTTCTCCCTCT^nCC 

CTCTTCAAGATTTGAGGGTATGCACCCCCACCCCTAGCCTTOTATCCCT 

AGTTAC . . .ATAGTTCTAATTTTACTATTTTTTGTTTACTTCATCAATAC 
C ^TGTAATCTCTGGAAGGAACGTTTCTTTTTGTAGTGTATTTCTAGCAC 

ctagaacagtacttggcaoitggcaggtgttcaa^gtaSgSgaSa 

ttttctcaaagtocatcsgagtcttagaagtttgag^ 
^™?I^ a ^ ctat ^ t ^ tgctaat ^S^Sg^ 

TGGGGCAATTCTCAAATTGACCTGGAATCCTTGAQATCTGfSenar'an'rra 
GGCTTAAAGGGCATGTTL-JAGGTACTTTTATTTCCAAATTCCCCAGTGGC

ATCAAGGAAATCAGCATCTCTGGATAGCTCTACTAAGGCTTAGTTCTCAT 

TGTCCAATCTAGCTCCTGGGTCATGGGAGGCATTCAGGAAATATTTGAGT 

GTAAGAGTGAGTTGCTTTACCTCCAGAATATCCTTCCAATGGCTCTGAAG 

CAGGCTGTGGAGTCCTGCTGGCTGATCACAGTTCACAGGTGGCTCCCAAA 

CCTGTGGTCTACATCCATCCTTTGTCAGTGTCACTGCCATTGTCCCACAA 

AT^TCATTTGGGCCTAGCCCCTGGGATAGTAATCAGTCTTTACAtAGATA 

TACATTGTG CTTTACATCCACAGTAATTCTGAGTGGACCTTAAAATAAAT 

TCCATGTCAGGTCTCACCAGCCCATGGGTTACAGATGGGGTTACCTTTCA 

GCCTTGTAAGGTGCCCCGTCTTTGAGTGTAGACATGGACTCACAACGAGT 

CCACTCCTGCTGTTCCTCTGCTCTTGCTGAGGCTTCTGCTGCTGCTGCTG 

CTGCrTTGCAGAGGCTGGCCAGCTGTGGTGCCTGAGGCACCTGTGTCTTC 

\CAGCACCAACTTGCATGGTGGCCACGGTGTAGTTGGAAAGGGATGCTTA 

GATGGGAGGCCAATGGGAGCTGCTTCAGGAGGCAAATCCAAGTCACAGAG 

ATCGAGTCACCGAGAGCATAGTAAACTCAAAATCCCTTCT TCTG CTTAAT 

AACTGAGATGCTGTCACTGGGTTAACCTCACCAAGCCTTGTTTTGTCTTC 

ACTTAGAGTGATTTCTGTCTTAGAAGGCTCCTCATATCCTTCTGGGGAAG 

GCTTCTAGTGAGTCCACAGATAGCTGGACCAGGCATGTCCAGAAA TAAT C 

TGATTCTCACATTTGAGTTAGCCAGCGTTCCCAGCTATATCCCCATTTTG 

TGTCTATATAAGTTACCAAAGCCCACAAGGATATTAGGTGGCTCCTTAGT 

TTGCTTTATGATTATGCCTTGTGTGTGTGTGTGTGTGAGTGTGTACGCCT 

ATGAGGATTCCTTCTCTCCCGTTCTTGCTATGGCTTCTCTTCCCCACTGA 

^GGGCTGTAGTTCCCTGTCCTTTTGACTTTGGGCTTAGTCATGTGACTTT 

TTTGCCAAGGGAATGTGGGCAGAAGTAACTGGGAGCCAGTCCCAAGCTAA 

GGCCTTGGGAAGCATGGTGAGCCTATGCCAGCTCCCTCAGAACTCCTTCC 

CTTGGCCATGAAGAGAGAATAACCTGGATTGTACCTTCAGCCCATGTCCT 

AGAATACAAACATGGAGAATAATGAACTTGACTCAAAGGCTGAAGGGCAG 

CTGAGCCCACATGAGGTCAATTGAACTGCAGCTACCTACAGACCTGAAAG 

TGAAATAAACATGTATAAGTCTCTGACGTTTGGGGTTTGTTTACATAGCA 

TTATTGTAGCAGAAACTTAAATAATACTGGGGGCTAAATATAGTGGACCA 

GTGACAGCACAGAATGGTAAAATGGAGTGATTGTTACTTACATCACAACC 

CTTCATCTCTGTTGATGGACACTAAAATCAAAGTGGCAATTACT CAGAG T 

TGGGAGTCATTGAGTTGCATCATTGTTGTTTAGAATCATTGACAGTTTGA 

GCTCTAAGTGATTACAGAGATGGTTTCCTCAGCTACAGGTAAATAAACAA 

AGGCACAGAGAAGTAAAGTGACTTCTAGAGGGCTTCATTGATATTTAGCA 

GCAGAATCAGAGCTAAACAATGAGTCTCTCATCTCCAGCCTTTCTATTCT 

^GTTTCCTAGGTTGGGATTTTGGGAAATAGTGCAGAGAGATTAGCAGTAG 

'GACATGGAACAATGTGAGCCTCAGCTTCCATCCCTGAGGCTGCCTTCAT 

^^GCCAGGGAAATGTCTCTGTGTGCAGCCTTGCCCTCTGCACACAGTGTG 

^ATGGCCACCTGAATAAGTGTCCTTTCATAGCGACTAATGGATTGAAATG 

3GTGCTAGAGCAGTGCTTCTAAAAACTCCATGTATTAATCATCTAGGGGT 

CTTACCAAAAACGCATGCAGATTCTGATTCAGTAGGTCTGGAGTGGGGCT 

TGACATTCTGCACTTGTAACACATGGACCACACTTTGAGTAGCAATGTAT 

TAGATCATTCCAGTGGAAACATGTATGAGTGATGGAATGAACAGATATAA 

CTAATCCAGGTCTGGTAAGTGAGGTACTGATACATATTAAGTTGAAGTGA 

ATTTCACATCAAAAATAATGGTTACACAGTGACTTTTACTGCCCCCAAAT 

T CTTT C Cr T T TGAGTGGTTTCAAAGTGAACTGAGCCAGCCAGGTTAAGTC 

CCTGGTTTAGTGTGTGATTAGAAGATTTGATCCAGCTTTCTCCTCCTTCT 

AATTCTTTAAATATGCAATGGCCTTCTAGAAACTTGTCTCTCAGGCTCCC 

CATGAGCCACCTGTCTTAATATCTTCCCCCCCAGGACATTTCCTGGGTCA 

AGGAAGGAATCAGGGACTAGGAAAAGTAGAAAGGTTGCCTGACAGTGAGA 

AAC' r TTTTGCACTCCTATTTGTTCAATTCTAAAATGTGGGTATTGTTGGG 

GCTTCTAATTGGAATCTAACCTGAAATTCAGGCATGTCTAGCTATATATG 

ACCAAGAATrAGGATGAGTTCACTAGAAGCCTATTTTCAGGAGAGCGGTC 

AGTTAAATTGAAGTTTATGGGTTTATGGTAATGGGTT GGGGA GTTTACTT 

CATTAGCAATAGCAACGTTTTTGAATCAGAGAAGTGATTTTGAACACACT 

GTACATAGTTTTCTCACTTAGATTTATCTCTGGGTCAACCCTTGTTGGAC 

C^ATATTAGAATCATTTAGTGAAGAAAAGGTGGGTGTCATTAGGAAAAGA 

GCCATTTATTCAAATGTTCTGTTTGACATTAGGGCACTGGCAAGACTACA 

GAATCAATAGATATTTAAAAACAGCCAGGTGCGGTGGCTCACGCCTGTAA 
TCCCAGCG7GATTTGGC .rTACTTTGGGAGGCTGAAGCGGGTGGKTTG _

TGAGCTCAGGAATTCAAGACCAGCCTGGTCAACACGGTGAAACCCTATCT 

CTACTAAAATACAAAAAATTAGCCGGGCATGGTGGCAGGCGCCTATAATC 

Z CAGCTACTTGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCAGGAGGCG 

GATGTTGTCATGAGCTGAGATCGCGCCATTGCACTCAAGCCAGGGCAAGA 

ATAACAAGACTCTGTCT CACAACAAACAAGCGAACATACGAAACAAACGT 

AACATCCAAACTAGCAGGTACATGCCGTGCCAGTCATGACCCATGGTCAT 

AAGATGTCTACAGCTCAGGAAGCAGCTGCACAATGCCTGCATAGACAAAC 

TCTTATGAAAGCAGAATGTCCTGATGTCTCCATAACACATAACAGTGTAT 

GCTTTTATTATGGTCATACTCTAGCTGTGATGTACCTACGCTCTAATATG 

CCAACGATAGTTTTCTTTAAATCATCAACATAATAAATGTCATGCTGTCA 

GTCCCCCACATGTAGACATAACTTAGCTGGTACATGGATAAGAAACCTAT 

ATTAGATAACCTTAGGCCAGGTGTGGTGGCTCATGCCTGTAATCCCAGCA 

CTTTGGGGAGGCCGAAGCGGGTGGATCACGAGGTCAGGAGATCGAGACCA 

CCCTGGCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAAA 

TTAACCGGGCATGC3TGGCAGGCACCTGTGGTCCCAGCTACTCAGGAAGCT 

GAGGCGGGAGAATGGCGTGAACCCAGGAGGCGGAGGTTGCAGTAAGCCGA 

GATCACACCACTGCACTCCAGCCTGGGGGACAGAGCGCAAGATTTCGTCT 

CCCAACCCAAAAANCNANNNNAAATTTGCACCCAAATCTGACTAATTCCA 

GAGCCAATTCCAATTTAGAATCGTTATATCTCCCTGGTGAACTGAAGCTT 

rTATCTTTAAGGAGACACACTCTTTATGTCTACCAATGCTTATTGeCTTA 

AAGTCCACTTTGTCAGATACAGCTGCTTTCTTTTAATTAGTTTTTGTGTG 

GTATATCTCTTTCCATCCTTTITCTTTCAGCCTTCTCCATTCTTACATTT 

TAGATATATTTCTTTmC TTTl ' TTTm 'GAGAGAGAGTCTCACTCTCTC 

GCCCAGGCTGGAGTAGTGCAATGGCGCGATCTTAGCTCACTGCAACCTCC 

ACCTCCTGGGTTCAAGCAATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGG 

ATTACAGGAGCCCACCACCAAGCCCAGCTAATTTGTTGTATTTTTAGAAG 

AGATGAGGTTTCGCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCA 

GGTCATCCACCCACCTCGGCTTTCCCAAAGTGTTGGTATTACAGGCGCGA 

G CCAC CATGCCCAGCTGATTTTAGCTGTATCTCAAAAACAGCATGGGTTC 

TGTTTGCTTTCCTTATTCAGCTTTATAATGTAAATCATTTACATCAAACA 

TCTAATACACCATGGACTGTAAAACACAGCCATATTTTATGTATGAATTA 

AAAAAAAAAACACCACCAATTAGTTCCTGAGACACACACCTTAACAATAT 

CTCTGTGATGTGCATAAATCAATCACATCAGTTTCTCTGCACCTCAAAAT 

TTCTTTCCTCAATTCTCAGAGATATGGCAATTTCTCTGGTTTTACATTCC 

CAGAAGCAAAGAAAAAGTACACAGCTTCTTCAAGTCATGAGTAGCTTCTT 

TTTTATAGCTCTTGGTGTTTGCAAAAAAGATTGGAATTGCTTCACTAATA 

CTAAATTTTCATTCTGCTGCTCTGTTTCTATGACAAGTCAGAGGGCATCT 

rTTTGAAGACATTCTAAACAGCAATTAAACTCAAAACATGTAATGACAAT 

GACACACAAAACTCAACTGATGACCAAATGAAGAGTTCCAGCCAAGTTGA 

CACAAGCTGGCTGACAGAGCTTGTAATACACACAGCTTGGCATATGCCTC 

GCCATTTCAGAGATGTAAAAATAGGAATAAATGTTTTCCCTTAAATCAAT 

GAAATAGAGCATTTGGACTGAAAATCTACGACAGTTATAGTGTTTTCTAT 

TC ATTA TTCTCATTCTGTTTCTTCTCCC^ 

TA TTTT CTATC ATTT CATTTrrCTTCCTACTAGTTTGA^ 

TATTTTCTATTTTTTAGCACTTACCTAAAATTACTCTGTAATCCATGGAT 

CCTTAATTTATTTAAAAAACTAATGTTAATGAGTAGCTTTATTTTCCTCC 

CATCTAATTTA AGGCC GACAGAACACCTTCACTTACCTCAATCCTCTCCC 

AACTTACATGCTTTTAATGTCATATATGTTAATACCGTATACTTTTAAAA 

CTTTCTAAAATAGCATTATTTTATAGCATGAGTGTTCATTTACATTTTTG 

CATATATTTAGAATTTTCTTTGCTCTTCGTTT C TT C ' IT CTATTTATGACT 

CCCCTCTGGGATCATTTTCCTTCTACTTGAAGTACATAGTTTAGAACTGC 

ACTATTCAATACAGTAGCCACTAGCCATGTGTAGCTATTGAAGTTTAAAC 

TAAGTAAAATTGAGTAATATTAAAAACTCAGTTCCTTCATCTCACTAGCC 

ACATTTCAAGTGCTCAGCAGCCACATGTGACTAATGACTACTGTACAGCA 

AACATATAGAACATTTCCATCATGGCAAAGAGCTCTATTGATAGTGTTCA 

TCCAGAGTTTCTGTTCCAGGACCAAACTGAGGGTTGGGCTGCTATTTCTC 

ATGGCCCAATAACAAGATGCAGATGAGCTGGGGAGGAAGAGAGTTTT T AT 

TTCTGCAACCAGTTACAGGGAGAAGGCCTGGAAATCATCACCAGGCCAAC 

rCAAAATTATGACGTTTTCCAGAGCTTATATACCTTCTAAGCTATATGTC 
TACGTGTAAGTGTGCATIvJACCTGAAGACGTAAGTGATTAACTTCTTTIh

ATCTGTAACTAAGGTCTGAGTCCGGAAGATCTTCCCCTGGAGCCTCAGTA 

AATTTACTTAATCTAAATGGGTCCAGGTGCTGGGGTAATTACCCTTATCT 

TGT CC C C7GCTAAATCATGGAGGTTTGGGGAATTC C CTTTAGAGCAC CAT 

TAACCTGTTTGTTGAAGGCCTGGGAATTTCCTCCAAACCCCCATTAAACC 

TGTTTAATCCCAAATTCKjTTCCGTTAAAAATTCCCTCCTTAATTTGTCCA 

ATJTTAAAGGC C CAAAAAAGGCTGGGGCAAACTC CTGAATGGC CTTTGTT 

ACATTCCAACCTTTGTTTAAAAACACCGGTTTTTAATATTTAACTTAACC 

ATTTAATCTCTACTGAAACACTTGTTATATAAATCTGCATTAATGAGAAC 

TGGCCTGCGCCATATCTCCTTCTCAGAATATCTTAGGGTTGTGATCCCCT 

GTGTGAAGAGAATATATCTCTGGAGATCTCAATCTCTCTACCCCAAAAAA 

AATCTCACTCGGAGAAAACTCAGACTCTTATCTCCACAGCGCTATCTCTC 

TCCTCTCC 

>Contig50 

GCTTGTCTAAGATGGTGCTCCTTGTTGCTGTGCCTGCTTTCATCCTGGGA 

rCTCCCTTCACCATCAGGATTGCCTTCACCTCATTCCAGTCTTGGATCTT 

TCTTCTTGTTTCTTGAGTATTTTTTTTTTTTTTTTGCTGCATTCCCTTCA 

GTGGCCTCTTGGGAAAAGATGTGTAGGGAGAAAAATTTTCTTTAGAAACT 

TGCATATCTGACAATATATTTATCCTATCCTGACATTTGGTAGATAGTTC 

AGCTGGGTACAGAATTCTAATTAATTTTCCTTCCTGATTTATAAGACATT 

GCTCCATTTTCTTCTGGCTTCCAATATTGCTGCTGAGAAGTCTGACACCA 

TTCAAATGCCTGATTTTTTCCATGTGATTGT7GTTTTCTGTCTGGAGTGT 

TGTAGGATTGCCTCTTTATCTACAGTGTTCTGAAATTTCATGACGTAGGT 

CTTTCTTCATTCATTATGGTAGACACTCAGTGGGCCATTTAATCGGGAAA 

AACATGTGTTCTTCAAGTTCTACAAACTTTATTACTTCCTTTTTCTTGTG 

TCTTTCTCTGGTCTGTTTTCAGCCCCGAGTCT CTTAG ATCTGTCCTCTAA 

TATTCCTATTGACTTTACTTCATTTTCTAAGTCTTTATCCTTTTGCT TTA 

CTTTCCGAGAGACCTGCTTAACCTTATCTCCCAACTCTTTTATTGAATTT 

CATTTCTTTTACTATATATTrTTTACTTTGAATACACCTCTCTCTTCCTC 

ACATTTTCCCCCATAGTATTTTGTCTTCAATTGACAGTTCTACTATCTTA 

TTACTCTGGAGATATTAATAATAGTTTTTAAATTTTTATTTATTTTTATT 

TTCAAAACAGTGTCTTACTCTGTCACTCAGGCTGGAGTGCAGTGGTGTGA 

TCATGGATCACTGCAGCCTTGATCTCTGAGCTCAAGCTATCCTCCTGCTT 

CAGCCTCCCAAGTAGCTGGAACCACAGGCATGTGTCACCATACCCAGC7A 

ATTTTTTTGTTTTTGAGGTGGAGTCTCACTCTGTAGCCCGGTCTGGAGTG 

CAGTGGTGCAATCTGGGCTCACAGCAACCTCTGCCTCCTGGGTCCTGGTT 

CAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGAAACA 

CACTACCATGCCCAGCTAATTTTTGTATTTTTGTAGAGACAGGTTTCACC 

ATGTTGGGCAGCCTGGGTCTGAACTCCTGACTTGTGATCTGCCCACTTGG 

GCTCCCCAAAGTGTTGGGATTACAGGCGTGAGCCACTGCACCCGGCCACT 

AATTTTrAAATTGTTAATAAAGACGAGGTCTTGCTATGTTGCCCAGTATG 

GTCTTGAACTCGTGGGCTTAAGTAATCTTCTGCCTCAGCCTCCCAAAGTG 

TTGGGATTACAGGTGTGAGCCACTGAATCTGA CATTTTTT AAAAG TTTT C 

TTCTCTTTACCAAGTCTTTTTTCCCCTTTCTGCrriTnGGGTTGTTTTA 

TTTTGATCTCTATCTTGCTAGAAACTTTCTGGAGACGTTTAGTAATACTA 

GATTTTTGAGAGTGGGCAACTGGAAAGCTGATTGGAAACTCTGAATACAT 

GGGTGAGGCTTGTTGGCTGTGAGTGTCATTGCTTGATGTCCTGGCAAGGC 

CAATGGGTTTGGGACCCCTACTATTAGTATAGGCCTGATTCCCTGGGAAA 

GGCTCTTTTGATCTCCTGCCTGGAGGATAAAGGCCTGGCTACCAGCCTTC 

TGTGTGTAATGTGAGGGAGAAGGGCTGGAGTATTCAACATCATGCTGAAT 

CCTTTCAATGATCATCTTGTTTTTAGTAATCTCCTACCTTAACTCTCTGT 

CTTCTGCTAGTATGGGAAAGATGACCTGAAAATCTAACCATTTATTTTTC 

CCCCATTAATATCATTTTATGATTATTCAGAAGTTAAATAA TTGT CATGC 

TGTCCTCC^VAAAGACTGAATCAACTAGCAACAAATAAGAATTTTCTCAC 

AGCTCTGCCAGCATTTTAAAAGAATAGCTTTATTGAGCCCAGGAGGTCAA 

GGCTGCAGTGAGCTGTGATTACACCACTCTACCCCAGCCTGGGTGACAGA 

GCAAAACCCTGTCTCAAAAAAGAAATTTAAGGAACAGCTTTATTGTTGTA 

AAATAGACATACAATAAACAGAGCACATATTTAAATTGTGCAACTTATAC 

TTTGATATAACCCTGTGAAAACATCACCACAATCAAGATAGTGAATATAT 

TTATCACCTCCTGATACAGTTTAGCTCTGTGTCCCCACCTAAGTCTCATG 
TCTCCTCCCTTCCTTCATTGCC^aa^^^CCTCi™^

AGC 5^ G ^^ GC 2Gm^GTGGTCCTCCCACTTCAGCCTCCTGAGT 
. . vjGGACTACAGGGGTACACCACCACAACTGGCTTAAAAAATTTTTTA 
'AAAAA7GGGGTCTTGTTATGTTTr^r a r^™i^^EIiII A 




CTCTGTCTGTTGCCCAGGCTGGAGTGCAGTGGTGCGATCTCGGCTCACCG 
CAAGCTCCACCTCCCGGGTTCAAGCAATTrTrrT^^^™S. G 



"AGC 




tcatgcaatctttttttattttaccatctctgtgaSccac 

TTCCAGCCT * CCAAAAATTTTTTTCTTTTTTTCTTAAAGATACTPrTrTr 

tgaotctcacsaactcttgaattgctact^^ 
gaatgcc^gggaattgcctgattgatcaaagaaatgtatS 

ATTCAAATATCCCTTTAATGTTAT^^^ 

gtacac^ttaggtgcaagagtgcatagctgttatttttttt^^ 
tgagactgttcatatatgcaagttatttaacagaa^^ 

tgagatgtcaggggggtctgatagagtacgtt^ggcagSctSS 

AAAAATAATGCCATTTCTGGTTTGTACTTrr^TA^™™^™ 



GOTCTGGGGGTTC3GAAGGACAGCCCTAGGACAGGCTTCK^ 
AUGTAGGGACTGCGAGGTTCTTGTTGAGTCTTTTTCATTCCTGGTCTTAC3 
A^TAGAATCCAA03CCTCTTC5AGAGTGGAAGGTGGGTTGGGAGGAGGG 
CAGATGGGGCTTAGGCCCAGGACACCCGTAGAGCTACTGCCCAGCTraTCT 

ACAGTACTGACAGAGGGAACCGTAGTATCGCACCCACTTCCTTCTCTTTC 

AATGAAAGTTTAAAGGTCACCATTTCCTCTGGCAAAGGAAG^CCACAAA 
TATTC C ATTTC CGGTCTTAGAAACAGCAAGGT& Tr & & tv \ a -rir«^» iT»w 
C^GTGTGCTAAACCCGA JCTGCCACTTCCAAGGAGTAGATGAGGAATG * C
CATGGTTCTGGGGAGCCCTACCCCAATTTGGGGCAGACATTCCAAAGCTC 
A^TTTCTGTGGAGGGGGTTGATGGTTAAAGGACGGCCTGGGAGTAACTCG 
-CTGTACTAGGGCCCAGGAGAGTTACATGCTGCTTCCCATGTTATTCATC 
ATTCCCCCATGTGAATAGCTATGGCGTGAGGTCCAAGGTTAGGGCCTTTC 
^ACCATAAATGGGGGAATAAAATTCCCCTACCAGCCTGAGAAGTTTCTGT 
?AJAAAGAGGC1TTTTT.TTTGCGGGGGTGGGGGAGCAAGCCGACTAATGT 
GTTATTTCCATACGGTTTGTTTTAAAATGTAGATGTCATATGCAGGAGAG 
GTGGTGTAGTGAGTCACAACGGGATTAGAAGGACCAGTCCGAAAAGCAGA 
AGAGGGTCAAGTTCAGGGCACTGAGGACTACTGCATTCAGTGGCGTGAAA 




^AAACTTGTGCCAGGCACCGTGCCAGGAGCTGGACTAAAAATTAAATCCA 

CCCCTGTGAGCTGCTCTGAAGGCTAAAATATGAAGTATGTAAAAGTAACC 

AAGTGCTGTACACATGCAGCTATTCAATGACTGTGTGGGCATTGCGGCAG 

ATTTTAATTTTCTTTTTTATTTCTTTCTCTTTAGTGAGAGCTGTTGGTTG 

TTATTATTGTCGTCGCTGTAACTGTCTATTTCACTTGCTTTTTTGTTGCC 

TCCAGCCCATTCCAGGGCTGTCATCTAAGACACTTCTTATCACCTAAATA 

ACCGGGGAGGCAAAGCGCTTTCTTAAGAGATGGATCCAGAAGAACAATGC 

^GGTTTTCTGTAGAAAAAGGGGCTGTGGGAAGT AGAGA TAAGAAGGGAAT 

^GGCCAAGATGAATGTACAGAGCCTTATTTTTTTTTTATAACACAGCAAG 

ATTAGATACAAAACAGGACAATAGCATCATCTGTTTTTATAACTGGAAAG 

3ACCTCACTTTACAGGTGGGGAAGAATAGAGTGGAGAAGTGAAGAGAATG 

GTCACAGAGTCAATCAGCATGTCTGCGTCAAAGCTGGGATTCCCAATTCA 

GGGC7CT7ACTACAGTGACGTATGGCTAATATTTTGGCATTGTTTCGGGG 

AAAAGCTGAAGCCCTGATGGTGTACGTCACTCTTGAGATAGTCTGTAGTC 

CAGCAGGGAGGAAAGCAAGGAAGGGAGGTGGAGGCAGCATTTTTGGGTGT 

AACATTTCGTTCTTGTTTTGTGGCCAAATCATAGTGTGATTGGGACAAGC 

CACTGCCTTTCTCTGAGCCTCCACTTTCTTTTTCTTCTTAAGAGGGAGGG 

AATAGTAGAGTAAAAGTAGTCATTTTATCAAACACCTGCTATTTTGGAGC 

CATATTGCAAGTGGGTTGGGGGTTGAACACTTGGCTTTATTACCCATAGG 

ATTAAATCCAACCTCGATACTGTGGCATTCCCAAACTCCAGTCTAATCTT 




ATTGGTTGAAATTCTCT^TTTTCCAGGGCCTTGCTTAAATATCATCTCATC 
CATTAAAACTTTCTTGAACCTCCCCTTGCCCTGTTCCTCCCTAATGTCTC 
AAGCCAGAATTTATTTCCTTTTGTGGCCAAGGGACrGGGTTTGTGACCTC 
"CTCACGAGACTTAATATTGAGACCAAACGTCTTTAGACCTCACCAGCCA 
SaGAGATGAGCATCTATGGAATGCAGGCTTTTGCuTGGACTTGCTGATGC 
AGGGCCTCTGCCTTCCTCCAGGGCCTCrCCTGCTGTTTTAGGAATTTCCC 
~CATGGCACAGTCCATGAGCTCAGGGTCAAGTTCATACATGi.i i - 1ACTT 
CTTCTACTCTGaU^TGGTCrTCTTGAACTCTGAGGGTCuTAAAGCTGCT 
CTGC^GTTTGTGGGGTGAGTAGAAAGGGGCTTTCAAAAGTTGTCKrrGTTC 
TTTCCCAC C C CAATAGCATGAAACACAAAGATGCTTACAAATAGCTGCCT 
TGCTTTCTAGTCCCAACTTCTCTCTCCTGAGGCTtTAAAACAAGTCCCCT 

aggttgagctggactggagttgtatcctatctrcattatctgtctactct 
ctttctgctctctagagaagatattatatatgtgtgtatgtatgtgtaaa 

t ATATAATATCCATATATAGAACATATATTGTTATAtTTACATATACATA 
CATAACATATGCATGTATTCATATATACATATGTAGTATCAAAGTTGGAA 
tta &ACTGTATATTrTGTAATTTGCTTTTATTTGCATuT ATCACTGTAAA 
ATGAATATTTATCCATACCGTAAGATATTCTTCAATGTATTITITrrrTT 
TTTGAAACAGGGTCTTGCTTrGTTGCCuT^uTGGAGTCCAATGACCCGA 

* • ^m^^^^^wf^n r*r»*nr<nnr>n.rTr'h LrtTrz^TPTTCCCACC 




rTTTAATTTTTTTTgTA' l'mTr ' l ' l 1 rA AACIACAGGGTTTTG CCACA TTG 
: CC^GCTGGTCTTGAGCTCCTGGGTCCAACK»ATCCTCCCACTTTGGCC 
~CCCAAAGTGCTAAGATTACAAGCATGAGCCACCACACCTGGCCTCAATG 
T&&TTTTTAATGGCTGTATAGTATTCC1ATCATGTGGTTGT ACCCA AAATT 
^T^AACCAGTCCCuAGTTTATtTuTIATTTTTtTTTAuTATTtTGAATAA 
-GT^^AGTAAATACCCAC^AATATGTAC^TGGCTGGGuTTAGTGGCT 
AGGTCAGGAGTTCGAC^CCATCTTGGTTAACATGGTC^^cS^





TCAGCCTCC7GAGTAGCTGGCACTACAGGCGTGTACCATCACGCCCGGCT 

AATTTTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCMGTTGGCCAG 
GCTGGTCTCGAACTCCTAACCTTf5Tr:ATf^ n rrrrr^^A^^r?r 



TTAACTCCAGGAGGCCTGGTATTCAGAGGGATTAGGGCTGG^GAAGGGC 
CTCAAAGCTTTCAAGGCCTC^AATAGGrTr:r a r.rrTAj?^^^r. 



TAATATACAGTAGACTTGTGTTAAATATTTAACTGATTGTAAAAGGA^AA 
AACCAGACGCAGTTTTCCCTACCATACTG 




^^^.^ui lAouui lA^ATCCCACAGGTTGAGGGCTCAGTCTCA 
CAAGACTGCCTCCCACTTCAGGTGCCAGTTACAAGTGGTAGGTTGTCACC 
TATGCTTCTGACTGATGGCTATAAATCTGGGTTTGCTTCCCTCGGGTTCC 
GTGAATTTGCTAGAGCAGCTCACAGAACTCAGGAAAACACTTAAGTTTAC 
CAGTTTATTCTAAAAGATATTACAAAGGATACAGATGAACACCAGATGAA 
GAGATGCGCAGAGCAAAGCATGTGAGAAGGGGTGTGGAGCTTCCATGCCC 
CTCTGGGGCACCACCCTCCAGGAACCTTCATGTGTCCAGCTATCTGGGAG 
CCCTTCCAAACCCTGTCCTTTTTGGGTTTTTAAGAGTGGCTTTATTACAT 




- ~ ~ * ^ ww * 4 x \j wjvajwj i i AAuAGTCTCAAGTC 

:S^IEH3SHI TGGTC " TCCTGT ^ CAAA CCCCATCATGAAGCTACT 
^v.ATTo^C - GCCAGCCAGCAGTCATCTATTAGCATGCAAAAGACACTC 
TTATTATTCCAGAGAT7CCAAGGGTTTTTAAAAGCTGTATGTCAGGAAAC 
AGGAGATGAAGAACAAATATATATTTCACAACATCACACTCGTTGGGGGA 

ATTGACAGGATAGCAAAACTGATTAAAGGAGGATAGGAGAGACTGAGATA 
TATATTTCCATATATATATATAGACAfiaf;ar!arti«ii<p»T^-r^^»^i>-i.«m. 




>Contig51

ACACATTTGGGGGAGCAGTTCCGGAGGTACAGCCCGGACAGGAGATGTGA 

GAAGATCGTGGTTANTGTrCCCCTGGTCCAGAACCCCTCCAAGTGGGCTT 

AAGTAGGAAGGGTGGTGAGCGGCAGGTAAACACACGTCAAAGGCAGTCTT 

CCTCTCTGAGGGAAAACACTTGTATAAGCATTGCAATCAATGGGCCTCTT 

TAATTATGTGCCAGTGGCAAGAGCGGGTGCTGAACCCAGGGGCCTGCCTC 

AATCCGGGGCCTTTGAGGCAGAATAAAGTGGTCTCAGGTTGTTGGCATTT 

CCTTGCCCTTCCACCCGAAGCAGACACAAATCCTCTCTGGAGGCAAGTTC 

CCCAATTCAGCCAGTACAACTCCCACAGACTAAGATCAATCATGTACAAG 
CTCACAGACAAAGGTCACCaaararirairn^r'n »>r» * »/»« » 




* w™_w * w«n 4 «n«nn **wvwilj«AAUAftTAACCACCAGCTGGGATGCTCT 

AAGTCTTCAGCTGTTAGAATTCCTGAATATAGAATAAAACTGCCACAATG 

3CAAACATGCATCTAGTACTTACTGTGTGCTGGGTTCTAAGAATTTTGCA 
ClTTGTGmr.iTirrnirTPsivwrfi /-■» /—»»■.» ^.«.« 
AGTCACAGGTAAGTAGv TTCTCACAGx ^ rGGAGTTAAAGGCATGGGAv. *

GAGACGAGCAAGGTTCCTAAAGGGACAGTGGCCAGTAAATGACCAGGGGC 

TACTGGAGTGGCTGCATGGCTCTGTGGAAGCTCAGAGGAGCCTTGGG7CC 

TGCAGGTGCAGTAGCAGCTTTCTGTAGTTCCTGATCTCTGGGTCCCACAA 

TCTTCCCCGTTTTTGCTCCTCCACTTCTAATTT7GTAACTGACTTCCCTG 

TGTGTACTTCTCTCTCTGATTGAAATAGCCAGACTGGTTTCTGTTTCCTG 

ATAAGACATTGTCTGGTACGAACACAGTAACTCATTTAATCCGATATCTC 

TrfTGAAGGAGGTACAATAATTATTCCTATTTTACAGATGAGGAAACACAG 

CAGAAAAATAAAGTCAATTGTCTAAGGTTGCACATTTAGTCAAGGGAAGG 

GTTGATATAACATATAATTATTTAGAAAACATCTAAGGAAATAAAAGGCA 

TAATTTAAAAATAAAACTAGGCAGGTTTAAAAAAATGAAGTAATCTATAA 

GTAAAAAAGTATAATTGTTGAAATACATATCTTAGTGGATGGGTTAAATA 

GCTGAAGAAATGATTAATGAACTGGAAGGTAGTTCTGAGGAAATCAGAAT 

TCAGCATAGATAGAAAAAATGGGAATTTACAAAAGTACACAGGAATTATA 

AAAGAGGTTAAATTATAGGGAGGGTAGAATGAGAATTAACATTGGTCTAA 

CTGGAATTTTGGAAGAAGAGAATAGAGAGAATGAACAAGGCAATATTTAA 

AGAGGTGGCTGAGAATTTTTCAGAACCAACACAAACTATGACTTTACCAG 




AAAAAAAGAAAGTCAGACTTAGAAAGAAATGACAATGGCAGACTACTCAA 

C AACAACAATGGAAAC CAAATTCAGTGAAACAGTATTTTCAAAAT.GCATA 

^TTAATCTATCTTTGAAGAATAAGGGTGAAAAGGGTGAAAATTGCTGCCT 

^ATACAAAATATCAACATTAACAAAAAGTAATGAAGGTAATATAAAAATG 

TTTTCAAATAAACAAAACTGAGAGAGTTTACCACCAACAAGCATTCATTA 

AATGGACTTTTAAATGCAGTTTTTAGGAAGAAGGAAAACAATTCCTAAGG 

AAGGTCTGAGATGCAAAAAGGAATTATGAACAAAGAAATTGTTAAAATTA 

TAGGTGAATTAAAAAAACTGCCTGCATAAATGATAATAATGACAATGATG 

CTATTAATAATGAGTTGATAAGGATAAAGAAAAGGACAGAATTAAAATAC 

TAGAAAACAAGCATGCTGGAAAGGATTCAGGAATTACTTGAAGGTTAAAG 

TTCTAGGGTCCTTCTATCCTTCTAGAGGGGAGTCAATATATTAATTTTTG 

ACCGTCACTTACACAGTGAAAAACTTTAAGGATAACCATAAAAAAATAGA 

AATAGAGAGTATAACTTCTGAAACAGTCAAGGGAAAAATATGGAATAAGA 

AAACTGACCAAAAAACATCTCAGTCAATCAAAAAAAAAAAAAAAAAGAAA 

GAAAAGGTTCGGAAGGAGAAAATCAAAGCATAGAAAAAGCGGGACAAATA 

GAAGTGGAAAAGAAAAAGGTAGAAGAAACAGGTCCAGAAATATCACTGAT 




■ - 3TGAGTCAACTTGTGATGATGAAAGGTTTAATTCACCAGAAAGAC 
^CAACTATAAACTTGTAATCAAATAGTTTTATTTTATTTACTTTATTTAT 

-""AT" T-GAGACAGGATCTTGTTCTGTTGCTCAGGCTGGAGTGCAG 

-GGC'TGATCTCAGCTCACTGCAGCCTCCACCTCTTGAGGCTCAAGCTTT 

CTTCCTGCCTTAGCCTCATGAGTAGCTGGGTCCACAGGCACACACCACCA 

AGCCCTGCTAATTTTTGTATTTTTTGTAGAGATGGGGTTTQ^CCATGTTA 

CCAGGCTGGTCTCAAACTCCTGGGCTCAAGCGATCTGCCCCCCTCGGCTT 

CCCAAAGTGTTGGGATTATAGGCGTGAGCCACGGTGCCTGGCCTCAAATA 

ACTATTTAAGTGAAACAAAACTAGTATGGCArTAATGAAAAATGTATAAA 

TCCATAATCGCAGAGGGATTTCAACTTACTTCTTTCGATTATGTAAAGGT 

CAAACAGACAAAAGACAATGACAAAACTTAATGCAATGAACACTTTTGAT 

TTAATGAACATATATTGGATATGTACCCAAGAATTAGAGAATACATACTA 

GTTTTGAGTTTATGCAGAACATTTACAAAAATTTAGTGGAAGCCTAAATT 

ATAAAAAGTTGCTGTCACGTAGAATAACACACAAACCCCTGAGTCCGGAA 

TTCAAAGCCCTCCACACTCTCCTCTACCTTTGCATCTTTATCCTCCACCA 

CACTGCAGTGCATACTCTGGGCTACTACTCACTGTTCTTGATTCAAATTC 

CATGTTCTGTCAGCTCAAATCATTCTCTCTGCCTGGAATAACTACTTCAT 

ACATATTCTGCTATTGAATTCTTGTCTTAGCACCCCATCTACTCCAAGAC 

GATGTCCAGTTGGGGTTACTCCCTGTCCCATTrTCTTTGATTACACTTTT 

^ TTrT , r , TACTrcCATTAXATrATTGATC ACATCTGTGCCACAGl , TTTTGA 

A^ G ~ GT cTGCTTTTACTCTTTTCTAGACCCTGAGAGCTCCTGAAGGGT 

^GGGTCATTTCTTTTTTATTTGCTCATTCCTCATGGCACAGTGAGTGCTT 

AATAAATGGCTATTGACTGAAATTAAACTGTATCTAAATGGACATATTCC 
TGAGGGCCAAAAAAGAGCAGGGAAAGGTGCAAAGACAAAATC1TCCATTT
TTAAACAATGTAAGAATGTGGTCCACCTCATGCTCAGGTGGGACT T 'TATC 
ATGACGTTATTTTTGGGGACTTATAGCTGCATCATTTACCCCATATACAT 
TTACCTTTAGTGTAGGGAACTGAGGACAGGAATTTTGTTGATGCAGACTC 

TTGTTTTACATTTCTTCATTAATACTTTAGTTGGTGGTTTAGCTTTAGTT 




:.:.^^l l mm UAft ^ 1 A <- 1 A l r ix: TGGCAATAATCAAACAGCAT 
GGCCATTTGTTTTGTAAGGCCTTTCC7AAAATATGACGGTAAAATCTACG 
TGTGGAAAAATGCTTATTCTTCTGTCCTCTATAAATGTGAATCTAGTTTG 
TCTTCAAAATGAAATCAAGTGATTAAAATGTAGTTTTCTAAGAAGATAAA 
TGGAGCAAAGCACTCTGTGTTTCACAGTGTTGGAAATCACTCATCCCTCA 
TAAAACTGTCCCAACTGATCCTGACTCACATGAATGAATTAAAATAAGAG 
TTAATAACATCAATTTACATTTTTAAAGACACTTTCCCATGXTTTAGACT 




iAwxw^wA^wiiwixi iAi 1 1 l^aau^UATATCTAATTTTGTTGCA 

GAATGGTCTGAATTCTACAAAAATGTTGAGTTGTGTAGTGTGGAGAAGTA 

CGGAGCCATTTACTGAAAGGCTGGGGGGAAATGACGAGACCCTGAGATAA 

GGCAGTAGTGGTGCGAACAGAGTGGAAGGGAGGTAGTTGAGATATGTTCA 
GAGTAGAATCAGAATGGAr a ti_ rvrrz aaraa n*rnn !\Trr^ ~ 




wwvj x i va^avan i 1 1 u-ruAUACTTCTAGGTTGATCCACTGAA 
GTTACATTATTCAACACCACAAGGAAACTAGGGGAATGAGAAGGCATACT 
GGTTTGCTTTGGAGTGGAAGGGCAGTGATGTAAGAGGAGTTAATGAGTTA 




* iu^ivjnu^iuLA^ju^iWji i UAALAlu TGTATTTCTTAGTAACT 

GATAGGC^TCAC^GACTCACATC^GTAAGGAAGCAACAGCAAACTTGATT 

GGACGATATACCTGGAACTCAGTACCCTATGACTGGAGCAAGTCTCTGTC 

AGTGAAATGAGGATAAGAAGAATCTTGACCTTGTGGAATATGTTGTTAGG 

AATATATGTGATGAACAACATAGGATACTTCCTACAGGGCTCCACATGTA 

GTAAGGGCTTTATAAATGCTTGATAAATATTATTG7TGTAATTTATTTCC 

AAAGTAAGATGCCACTGGAGGAATCTTTGGAACCCAAATTAATAACAAAT 

AGGACTGGATGCAATGGCTCACACCTGTAATCCCAGCACTTTGGAAGGCC 

AAGGCAGGAGGATCTCTTGAGCCCAGAAATTCAAGACCAGCCTGGGTGAC 
ACAGGGAGA rCTTGTATrTZi TO. A A r: A rtts * a * a a * * -w* * ✓---./-. *m /-»-*,-. 




w * -w* * www * w.w-w--tw j. unuAV, w w 1 A 1 V. * Uwi 1 AAA 1 AAA 1 AA 

ATAAATAAATAAATAAGTACAAACCAGCAAACACTAATCCTTTCTAGAG^ 
TTATTGAACTCTGGAGGGCAGATCTGAATGGAGCCAGCAGAGGGACCTAT 
GGAGATCAGCCTGGCCCTGGACAGCACCAGGCAATGGGGTTGCTAGAGAG 
GTAAT(SW3GTTGAACAGGGTT^^ 

GACTCAGACTAAI 1 1 1 1 l l'lTlTlTGCATGAGGATTAGGTGTTCCTAGGA 
ATTTCAATGAGAGCAGGGTTAATGAAGGAA^^ 

G<jAAGGCATCTGAGAGAGCCTGGCTTATGAATGGCTGCGTCAGTATG^ 

CACCTGCTTTCCTTGTATCTACTTAGCAGATGATCCCACCCCAGGCCTCC 

AGGGCCAAGGTCATTTCCACATAGTCATGGGCCCTTGAGGGCCTG<3AGCA 

GTGTAAGGAAGACAGAGTCTTAAGAAATTGCATTAACAGTCATGGTGCTT 

GGCAAGTGTCGTCATCCTATGCGUVGCCTGATCTGAAGGGGTGCATGCTC 

ATAGGTAGCTGCTGCCCAAGATTACAGCAGCTTCTTCAATCCCAGATCCA 

TGCTCTCCTATATTCATTTTTCCAGGGCTTCCTGTCCTTCGACAGTGATG 

AGATGCAGAATGACTTATTGAGTTATTCTCCTGATAGTTGCCAACTTTTC 

CAAATGACAATGGGGCATGGAGCTTGAGAGTGGAAATGAGGCCCTAGGGA 

TAGCGTGCTTAGGAAAACACTCCCAGCCTGATGTAATTCTGGGGGTACAA 

TGGCATTTTCATCATCAAGACTGATGTAAAGGGTGACTAGCAGTGAGTTG 

GGGGTGACTCGCACTGGGGCTAGGTTTCTGATTCTGCCTAATCCAGACAG 

AGCAGAAGCACTAGTGGGCTGGTAGAGGGCCTCCAGGGCCTCACTTAATG 

TGATGCCTGCCCTGCCATTCCTGCGTGTGATGTCTCTGGGGCATCTTGCC




3TGGGGTGGAAACKjTAGAQAi<A(jAtiAA<-AiAiAV3v-vji x x x«_v-x - ^ 
GTGTGGGCATGTCATAGAGGAAATACCCAATTCCTGAGCCTTGAGCCCTC 
-AGGAAACCrTGGAATATTAGGTTAGTCATCCCCAAGGAAGTCTAAGAAT 
-CTGGTCTCACCCATCTCCTTTAATTCCCACAATGATCCTACATGATATT 
AAGGAACACGGGCCAGTAACCCTCCAAGCAATGGATGTGGTGGTGAAGTT 




^G T TTCAAACTGTTCTCC^CTCAGCGTTATTAAAJjUA iaui uai. 
ACATAAAAATTGTATATGTCTACGGTGTACAATGTGATGTTTCGATCTAT 
3TATACATTGTGAAATGATTACAACAAGCTAAATAACATACCCATTCATC 
GTGTTTCAAAGGAATTAAACTCAAGCACAAAAGAGAGGTGCTGTTGAAGA 
GTAGGGCTGCTCTATCTAAGTAGTATGTCTGGGGTTGTCCTGGATCAGGG 
TCCTTTTGTGCTAGTAATAAACCAGCCCTTCTGGGGCTGCTC€ACTTTCC 
CCACATTTTCTTCTGGAGCCTCCCTAAGAATTAGGACATGGCCACTTTCT 
„ w „„. -r » r^.'^-rrr-rarTTrAa.eAAGGACAGGGCTTGTGCTGCCCCATGC 




ACTTGAGTGTCC wTAtAGtAuftvjAVj*- x J- u^-»<-fv_ * x - — — 
!AAATCCCCCAGATTAATCTTGGTTCTAAGCATCATGGCTGTATTTCACA 
. » t-t<o ,-.•» » R«r«r»r''Rr!/-&Tftr;Trr:aaT&aR<5fi.TTTTTGTGCTA 




-AGGACT^"TATTAGTTTTAAAAAATTATACAAGCTTAGAGTAAGAAAT 
TAAACAGTGCAAAAGAATTCACTGTGAAAAGTAAAATGCTCTG^CTGC 
TGAGAGACAGATATTGCAGCCCAGATACTACTGGGG7CAATAGTTTCCTT 
TAAGCATGCCATTTTGATGGTTTATGGGACTTACAGCTCAAGAAGCTTGA 
rwTTnsTT'prir.i & & ATr &TTGTTGCAGGTATTAGATATGACCG 



TCTCATAAAGATACACACACAGAWCAW^iT^UAiAi j. v«w * ™, 
CTTATGGGCTGCTTGTCCTTTCTGCTCTGTGCCTAAGTTGGGCTCAGAGT 
AGCCTGGCATCGGCTGTGGGGAGAATGCTGGCATGGGQTTAGCAGGAGCC 
CACTTAACATGTCCTAAGCCACCTGGAAGAGTCCTTCAAGGAGACCAGAC
TCCAGAGGCCCTAAGGAAGGAAGGACTrTTGCCCGTTTTTAGGTATTCTA 
GTCCCAGAGTTTAGGGAGGAATGGTTTGGCTTTGGGTCGTGTGC^CTTT 
ACCGAGTGGGATGGGATGTGCCaVTGAGCTGTTGAGCTG^TCTTGGAGA 
AGACAGCAAAAGCGGGAATAAGAGGTCACXSAAGCTGTGTGGTTGTAG^ 
. . r.r.» /-•* rsr^r'r^rMr^viTraaaACyrGGTCATGGTAGTGACGGTGG 




AGGCTGAGGTGGTAUAAAA x u«jauwauuu»wv.wwi x --------- 

"CTGACCGAGCTCCTATGCTCTCCTGGTTCATTTTAGGCTCTGTAGCAGC 

AGATGATTGGCTGGTGTGAGAGCAGTGCACCTGCCATATCAGGCAATCCA 

AGACAAGTCCAAGCTACGCTGGGAGGAAACCTGAAGGCAGCAGCAGGTAG 

ACTGGCTGaAGACAGACAGGCAGGCAACTTGTCAATCAGATTTGTCTTTT 

TAAGGACTTITaACTGGGGAGCCCTCCATGACAGATCAGATGAGAGAGGA 

A^TGGGTCCGCCCATGTGTCAAGCTACCAGA(^TCCCATC(3GTGCTTG 

" ^^t,--.--.^* » r!r^«MTrrr:aflGTTTGCAGGTAGAGGGTGAGCTGGT 




CAGAGGGACCTATTOvjWiAi»«- iaw.uu»vj»v.** x II^^m 
^GCACCCCACCGCCXXJCAGCXrGaK^GGCACTTCTCCTxTOC^CCA 

GGACCTCACAGAGGCTGATCTGGCTCTGTGAGGTGGGAAAATGGGT^GTA 




GAAGGGGCTCTCCATCCTCTGTCTGCCTCTAGCAAOT^^iwx^^ 
^CTGGGCAAGACACAGGGGGAAATGCCATCTGTTATCCAaATATATTTCA 

ATGTGACAGGAAAGCTGTCTTTAGAGCACAGC 



>Contig52
v-TwC TCTTTCTTTCTCTTTCTTTCTTTCTTTCTTTCTTTCTTTC^

--~ T - i CTACTTTCTTTCTTCTCTTTCCTTCCTTCCATCTTTCT 




ACCAAACCTGGCTAATTTTTGTATTTTTAGTAGAGATGGGGTTTCACCAC 

ATTGGCCAGGCTGGTCTTGAACTCCTGACCTCAGGTGATCTGCCTGCCTT 

GGCCTTCTGAAGTGCTGGGATTACAGGCTGGGCCTCTACGCGCGGCCGAG 

ACTACCTCTCTTTTAACTGGATCTCTGAGCTCTGGGCAGAGCCCACCCTG 

AATCCTGGTCTCCAAAAAGGGAAAATTATTAGGAGGCTAGACCATATGAT 

GCTTTTACAGTGCACTTAAAAAAAAGTTTGTT7TTTTTTTAAAAGACATT 
TrTaraTrtTPTa & & rr» r»i Tw^f**^ •» > . « . 




* x U w«.i nU v. i/xv, i u«u««i ai AATAGTCAAAAAAATCAGGTAACACAA 

CACAAACGCAAGCAGTTTAAGAGCTGAAATGAACTTGTCTG7TTACACTC 

TAGGGATTCCATAAGGAAAAATAGAAGTTTCTCCCTAAAAGGGAGCCTGG 

CACCTTCTCCATTTTCTTTAAGGAACCCCAGGCTATTATAAACTATTTTA 

GGGCTCTCATGCAGCAGACGGTGCAAGAGAAAGGAGAGACAGCAGAAGTA 
AATGAAGAAAACAGAATCfirJTraariftifta » » » » » » ~mm~~.~~~.~. 




^uwww^uvjs^avj i AUUAAAGAAAAAAAAAACATGAGGGCTATTTJ 

ATACAAAGACGCATACATACACATGCACACATCTTGGATGTTAGCTTTTA 

ATTAAGCTGACTTTTAACTATTGAGGTCCTTTAAAATAAATCTTTTAAAA 

TCTTATTACGATATTTCAGCTAGGACAAATTGCTGCTATTTCAGCATTAC 

CAAGTATCAAACCAGAAAAGGCTTGATTTAGGAACCAAACCCAGGCTGTC 

GTGGTAGGAAAAAAGGCAGAACGTTAGCTATGGAACCCACAGCATGGGGC 

AACAGCCATTGCTCTTTCAGTATGGCCTGGCTAGCAAAAAGGTGGCCTTG 

TTATGTAAATA^GCCCGTTTGGTGGTCAAAATGAAACATCTTTTCCTTT 

TTTTTTTTCTTTTGCTGGCCGTTTTTTCCCCCACCATACCACGTTTGTGT 

GTGTGGGAGGGTGGGAATTTAGCCACTTCAGAGGCCTCATTCCCCATAAT 
TTGGAAATTTCCTTTGGATTTdTr a a r.rn >/?iT«r» /-"n* ^/-.-t.^. * 




- w w urramsnv. j. wwmwvs\_AA 1 A/UtAAUAIjAAACAAACAGTTAAGC 

AAAATGAATGATCACACAACTTATATGATTACTGAGTGCTCTAATGGTAA 




«ai a wiuciA^-i iAUOaTTQGGTCTCAGGCTGAAGACCGCTCACTA 
CCGTTCTAGAAGCAAGAAATAAAACTTGAACTCGTCTTACCTGTGTAGCA 
GGACAAGCCGCAGACAAAATCCCTCAGACACCAAATTAAAGAAGGAAGGG 
CTTTATTGGGCCTGGAGCTGCGGCAAGACTCACGTCTCCAACAACCGAGC 
TCCCCGAGTGTGCAATTCCTGTCCCTTTTAAGGGCTCACAACTCTAAGGC 
GGTCCACATGAGAGAGTCGTGATAGATTGAGCAAGCAGGGGGTATGTGAC 
TGGGGGCTGCATGCACCTGTAGTTAGAATGGAACAGAACATGACAGGGAT 
CTTCACAGTGCTTTTCTTATGCAAATAACCGATTAGATCAGGGGTCGATC 
TTTACCAGGCCCAGGGTGTGTCACCGGGCTGTCTGCTTGTGGATTTCATT 
TCTGCCTTTTAGTTATTACTTCTTTCTTTGGAGGCAGAAATTGGGCATAA 
GACAATATGAGGGGTGGTCTCCTCTCTTACCTGCGGGGAGTGAGCTCAAA 
CTCCTTAAAGGAGTTACCTGCCTTCCATCATCAGGGAAGCAGGAAATCTT 
GCCTTCCTTGTTGGAAGCAAGTAAAACTCAAAACAAACAAAGAAAAAAAC 
AGG^GTTGTACAGCAAAATAAACTTITGATTTTGACCAAATTTTGGGAG 
ATCAGGAATTCTCTGAAGGAGATGCTTTCAGACCTCAGCAAATTGTCCTG 
TTGGTTTGAGCCATAAAGTTAGCTCATGCTGGTACCAAACACCAGTAGGA 




ACAGGTAGGTGAGAGACAAACAATTGCATTC7TCTGAGTGTCTGATTAGC
CTT^CCAAAGGAGGCAATCAGATATGCATTTATCACAGTGAGCAGAGGGG 
TGACTTTGAATAGAATGGGAGGCAGGTTTGCCCTAAGCAGTTCCCAGCTT 
GACTrrTCCCTTTAGCTTAGTGATTTGGAGGCCCCAAGATTTATTTTCCT 
~^ACATCAC7GTGGGCAGCTGACTAGGAAAGCTTTGTAGGACTGGTGGG 
CAGTG7GAGAGCCCAGTGGGGGGTGGTGGTCCTGTGCCAATGGTAGCAAC 
CACC""G7GAGGC7GAGTAAACTCATTTCCCAACCTCCTCTAGCAGCCCCA 




GTCTGGi <oAL» i Li lAAlVJWUiliUUMUHUUVjnnuw j. «w * — 

CAAGAAAATGGTTGAAGAGATGGGGCACAGAAATTAAGCTGGATCAAAAA 
GGAC GGAAAAGCAGAAAGGGCCGATAGAGAGAGGGGAT ATCTATGGGTTC 
GCGATTCTGAAAAGGACAAATCACTGGTGCTTTGAGAAGAGAGAGGGTGA 
GAAAGCAGGAAGGCTGGAGGCTGTCATCCAAGAGGCGGACATCTGTGAAC 
. «^»/-m^»^*/"x/^r»Trv?rirtrtTrir:rr , iAaGGGAGTGCCTCT 




"CTCACCTCCTACTwTTiUU, itl-i ilJlJ4Ll.u»«uftinnifwiw* * 

AGAGAAGTACCCATATTTAATTCATCTGTGTCTTCCTAGCAGTACTAAAA

ATATTATATGAAAGGTATCAAACCTTTGAGAATGTGTGCTGCTAAATTGT 

TAAGGATGCTGGAAAACTCAAGACGTCCCTGATCCTGAGCCTGAGTATGA 

GCCTGTGGTGAGCCCAATGCAGGTCTCCATTCAGACAAAGGCCTCAGGGA

ACGGATGAGACCTAGGGACAGAGATGCATGCTGGAGCAGCATTCCCCATC 

C^AC"GCAGCTCAGGCCAGCTGACTGCTTTATGAGTAAACGTTACCAGG 

GAACACTTTGCAGTCTTAACACACATGCCCACCTGTGACCACTGATCCCT 

GTTGGGTGACCACTGACATCAGAGATTCGATGGCAGCAATGAAGACAAGG 

CTATCCTCATTAGGAAGGAAAGGAAGGAGGAGGGAGGAGGGCAAACGAAT 

CTTTCCTGCTTGTCAACCACGTCCATCTCTGTTAGGTGATTTCCCATGTG 

TGACTTTGTTTATCTTTATAATAACTCTGAGAGGTAGGTCTTGATGTCCA 

CATTTTGAACATGAGGACATCCAGCCAGGAAGTTGAGTTCTGGGGACATA 

GCTGAGAGGGCAAAGCTACATATAAACCCCTCTTTGTTTTTTCTGGCTTA 

TCCACTGAGTGCCCCCTGCAATCCACCAGCCCATTTGTGAAGTGCATACT 

ATAGGTAAGTTGGCACAGGAGGAGTGGATGTGGGCGATTTTGTCACAGCT 

CTCCAGGAACTTACACACTGGTCIAGGAGGGCCAGGTATGTTCCTGACCAG 

TCACAATCAAAGCAACCTCCTACTAATCAGGGAGGCTTGGTACCTGGGGA 

ATGCTATGTTGAAAGGTTCTTTTCrGGGTTTTAAAATGATGGGTCTATTT 

CCTTATTCTTAAGATTGCTTTTTrrCTGGCTAGAACTTAAAAGAAATTTT 




GGTGGCC7TGATACT. I I AAAAi hi lOi-v- J. j.« *««w»+>^-~- 

AGAAGGAAGTCAAAGAACATGCTAGATTTCACAAAGGTTAATTCCTTGAA 

ATC^AGTTATCTACAGGACAATGTTGTCAAAGAA 

GCACGGCGGCTCATGCCTATAATCCCAGCACTTTGGGAGGCTGAGGCAGG 

TGGATCACCTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGTGAA 

ACCCCATCTCTACTAAAAATACAAAAAAAATTAGCCAGGTGTGCTGGTGG 

GCACCTGTAATCCCAGCTACACGGGAGGCTGAGGCAGGAGAATCGCTTGA 

ACCCGGGAGGAGGAAGTTGCAGTGAGCCAAGTTCAAGCCACTGCACCCCA 

GCCTGGGG^ACAGAGCAAGACTTTGTCTCCAAAAAAAAAAAAAATTCAAT 




AGAT AT AGGAAACACTGCAATGGGAiTriri vjuovj i\j<jutjw»*«w»»r» * 
A^CAACTACATATACAGCACGGGCAAGGACATATTCATAGCCAGG 
AGAGCAAAGATCAGTCSGATCKIGAAATTACTAACJAGGAAACATGAAAAATA 
AGGGAGCTTCTGCCTAAACCCACCTAACCGGATCCTTGCTGAAGACAGGA 



GAGATTTCAAGGGTQATCAUAi ai iwuiiiwniww * * ^ " 
AG^GTTTACAACSAAAGTGTACAAATGTGCCTGGGAGAAGGTTCAGGAGC 

CTGACTAAAATTTGGTCAAGCAGAGAATATTTGCCAAGATAATAGCTAAG 

T^TTCTG^CAAACAATAGATGCTAAGCCAGCAAGGGTGATGTGCT 

AAA^ACTGAGGGCTTATTTCCTTTTCCCCCAATCTCCACTCAGTCAAGT 
TTTCAGACAGTGAGAACAAAATTAACAGAGTCATCTGC^^GGGTGGCT
TCATGCv. . GTAATCCCAGCACT7TGGGAGGCCGAGGCGGGCGGATCACAA 

ggtgaggagatcaagaccatcctggctaacgcagtcaScg^tctc^ 

CTAAAAATACAAAATATTAGCCAGGCGTGGTGGCGGGCACrTOTartTrrr 



ACTACAGCCTGGAACTCCTGAGCTCAAGCAATTCTCCTACCTTGGCCTAC 

GACACATCCAACX3ATCAGGCAGAAAGCCTGTGCGGAGTGGGATGAGCAAA 
^5f^2SIS^ G ^ CT ^ GAG ^^ TGCAG TGCCAGCTAGGACAG 
GCCTTTTTGGGCTATGGGAGGTTTTCAGAGGAGACCCCACCTAAACTAAC 
CCATAACATTGCAGTGGGGACCTGTTGAAGTCATGGACTACTACC^TGAAA 

GGTATCTTGCCACCAAiT a c n t<v* »r» r^r*/*£*<X~^?~Z Z - 



TZ. ZZ: ^ Ull -^ 1,J11 utccAaGCTGGAGCACAGTGCAGTCGTGCAATC 
ATATTAGATTGGTGCAAAAGTAATCACGGTTTTTGTCATTAAAAGTTTTG 
CCATTACTTTTAATGATAAAAACCACGATTACTTTTGACGCAACTTAAAA 
GCTCACTGCAGCCTCAAAATTCCTGGTCTCAGGGAATCCTCCTGCCTCAG 
CTTC CTG AATAGCTGGGACTACAGGCACATGCAATC CTAC CTGGCTAATT 
TTTTAAAAAT Till 1 1TUTAAAGATAGAAACTCATTTTGTTGTCCAGGCT 
GGTTrCAAACTCTTGTCTTTGTGCCTCCCTCTGCCCTGTGCAAGACCTTC 
TGGATGCCCACTAATGAAGACTTCCAGGGAGAGGAAAAGTAAACATAGGT 
CCCTGATCAAGGGACCAGGGTTTATCGACCACAAACAGCATGCCCAGATT 
CCACTGGCAGTCCTAGAGGTCGCATTTGCCCCAAGTGTGTGTGGAAGGCC 
TCTCCCTAGa^GTTGGTTTATACACCAGCCACAGaVCAGCATATTCTCTT 
AAATTGTGAACATTTGCAAAAACTCCTTGAGGACAACTATCATGTCTTGT 

GTACTTTTGTTTTGTTTCCCTTCCCCTATGTACACGCGCGCGCATGCACT 
CATGCACGCACGCGCGCGCGCarararar'&^ii^* o» « « . 
AGGTGTCCACTG7GCCTTTGTGGCAGGAACTGCGCTTCTCTACTCTCCCA

CTTTGAGGCCTCTGGGGCTGGCCTGCTGCCTCCTCATTGACAAGGCTGCT 

^ACTGAGCAGTTCATTCTGAGCTGGACATAGTGCTTCTGGTGAGTCTCTA 

C-TC-AT7TAACCCAAAGATATTCrrTCCTAAGGAAACGCTTTCCTGTCG 

GGGGAGGT7AGCTC CAGATGGAAGTC ACAAGT GATGGCATGGTAGCTCT C 

ATCCGTTTGGGTGGATGATATTCACGGAGCACCACCATGAGCCAGTCATG 

GAGGTGAACAGTATATGCCAGCCCTGAATCAGGTGCATTGACAGCAAGGG 

AGACAAGCAAACAAAGCTGAGGTTTGCTGAGGATGTTCAAGACTCACACA 

GCACAGAGGAGCATCCACCACCCAGCTTGGGAAAGGACTTGTTATAGAGG 

GGGTGAAGCATGAGCTGAGTCTTGAAAGACTAGAAATTAGCCAAACTACA 

AGGAGGAGAAGGAGTTTCCAGTCAGGAAGAACAGGTTATGCAAAAGCACA 

GAGACTAGAAAGAATATCACATTCAAGGAACTGCAAATAGACAGGAAAGA 

TTGATGC GTGGGATAGGAGAGGAGGGCAGGGGATTCCAGGTGGGCCCTGC 

TTGCCACACTCAGGAGCTTGAACTTATCCACAAAGGAGGTGTGGAACCAG 

T AATGAATGGGTTTTGTGCAAGGGCTTCATGTCACCAGATTTGCTTTTTG 

GAGATACTTCTGTGGCTGATATGTGAGGAAGGGATGGAGGAftGTTTCCGT 

GGCAATCAGGAAAACCAATTAGCAGATGATTCAAATGGCCTAGGGGAAAA 

3GGAGGAGGACTTGGACTACCATGCAGCAGCAGAAATGGAGAGAAATAAC 

AGATCCCAGGCACTCAGGAAGCGCTCAGAATGAGCCCTTCAAAGAACTTA 

T GGTAGGTGATGGATGGATGGAGTGTGAGTCCTGGGATAGCATTGCCTGG 

GAAAATACTTTCTAGTTGAGACAGGGAAGTGGGCCAGCAGAAATGGAGGG 

CTTC^^CTTTTTGCTTTAAATACTTTTATAATATTTGGAACTTTGAAAAT 

GAGCAGATATATTAGCAAAAAGCCTAAAAGGGATATTTTTGAAATCACTG 

CTAGTTCTAACATATAACTTTCAGCTTGCACACATCATCAATTAACTTTG

ATAGCGCCTTTCTGAAACTATCATCCCAAATAGCAATCCTTGTAAAAACC 

TATTTTGAAAAACGGGCCTTGTAGGATAGCCTCACAGATGTTTTGTGGTA 

GATTTTCTAACATTCTAATGTCAGGGAGTGAAAGGAATCCCGTTAGAAGT

TGGAAAATTCTGGAATCTCTATTCATGGTATTAAAGTTTTGCCGTCACAC 

AAAAGTTTAACACCTTTACACAATCAGACTTCCTCATTTTACATTGCTCG 

GTAATTAGAGGAAATCAGTCACCCAGAGCCTGGGTCCTAGACTTGACAAA 

ATGCACCCAACAAATCCTGAGTGGCCTTGCTGAGGACTTCTCCCAGAAGA 

TAGAAAACTCAGTTC CAGC CAACAAGGGGGAAGCAGC7GAAGAAG7G AAA 

TTAACAAAGTCCTGGAAGGAAATGACCAAATCATCTTTGATTGTGTAATA 

ACCAGAGAGTAGAATACAGCTACGACAGACATTTTGGGAGAGAAGCATTT 

TATCATAGCTTTTAGAAGAGAATATTTTTCAGCATCATAAGCACACAATT 

CCAAG AC AGATACTTT CAAGGGATTGTTTTGACG 

>Contig53

ATGTTNNGGTTTTGGGACCCCATTCAAACTTCATGTTGAATTTTAATCT . 

-AATGTTGAGCGAGGTCCTGTGGGAGGGTGATTGGATCATGGGGGTGGGT 

TCTCCCTTGCTGTTCTCAATGATAGTGAGTGAGTTCTCACAAGACCTGGT 

TATTTGAAAGTGTGTAGCACCTCTCCCCTTCATTCTCTCACTCGTCACTG

CTCCGCCATAGTAAGATGTGTGTGTTTCCCCTTTGCCTTCCGCCATGATT 

GTAAGTTTCCTGAAGCCTCCCAGCTATGCTTCCTGTACAGCCTGTAGAAC 

TGTGAATCAGTTAGACCTC'rf rTCTTCATAAATTACCCAGTCTCAGGTCA 

TTCTTTATAGCAGTGTGAGAGTGGATGAATATAGTGCCATATGTTTGTAT 

TCCCAGCTACCCAGGAGGCTGAGGTAAGAGGATTGCTTGAGCCTGGGAGT 

TTAAGGCTGCAGTGAGCCATGACTGTACCACTGCTCTCCAGCCTGGGTGA 

CAGCGAGACCTTGTTTCCAAAAAAAAAAAACCCAAACTGTGTAAAATGTG 

TTCATAAAAGTGTCTTGCTCCCACACCTGTCCCTATATATCTTATTCCTC 

AGCCTCCGACAACTACTrrATTCATTTCTTATGTATCTTCCAGAATCAAA 

AAAAAAAAATa^AATACAAGCACAGTGGAATGTATTGCCCTrCTTCCCCT 

CCCTTTTGTTACATCAGAGTTAGCATATCATAAATACGGTCrGCATTTTC 

T^CTTTTTCAGCTATCAGCATGTTTTGGAGAGGATTTCATATTCGTGCAG 

ACAGCATGTATTAGTCAGTCCTTGCATTGCTATAAGGAAATACCTGAGAC 

TGCATAATTTATAAAGAAAAGAGGTTTAATTGGCTCACAGCTTCGCAGGC 

T-GTTCCACAGGAAGCATGGCAGCATCTGCTTCTGGGGAGGCCTTAGGAAG 

CT77 t AC7CA7GCAGAAGACAAAGCGGGAGTGGATGTC7TA7ATGGCAG\j 

AGCAGGACTGAGAGAGAGAGAGAGAGAGAGAAAGGATGCCACATACTTTT 
CAAGGTCTGrTGTT^CT^a^S^IP CCC ^^ G,mGACC AA



[GCGCTTTGAC 

•TTTTTGCATCATT 




GO^TTGAGTCCACATTCAGCACAGGACTCTCTGGGTACAGCTCTCTTTA 



CAGGATTGAAGGT7GCA^GCAGTTAAAA«rTATGT7AAATTTATTTACAT
-AATGCAAAATTGTCAAATAGACCTGTTCCCAGCTrrrCCTAGGGATGGG 
GGCGGGGAGAAGGTGGTTGTCTGGGAATAAGTGGTAGCAGGAGGCTGAGA 
AGGGCTTCATTCCATAGCATTCACTTACCTCCAGCTGTAGAGTGGGCTTA
-CATCT-TCAACACGCAGGACAGGTACAGATTCTTrrCCTTGAGGCCCAA 
GGCCACAGGTATTTTGTCATTACTTTCTTCTCCTTGTACAAAGGACATGG 
AGAACACCACTGAAGAAAGAAGGGGGTCTTGTGGTTAGGGACACAGCAGT 
GCAGGGTCACCCCAACCCCTAGGCCCCATGAGTAGGATACATGTAATTTG 




C T TGATCAATGTGGAAGGAAA(j«Aiw^w i V»av»vjv_ i vj i«w * * «- * 
AAATCAGGCACCAGAACTGTTTCAGGAACAGAGAGTAGCCCATGGGAAGA 
AACTGGGAGAGGAGAGGCTGAGCTGGGAAAGTGGCTCCAAAGAGAGACAC 
« -.rr^rrrri r artf" a.rtTGTCAATTGGAAGGCCCTGGGA 



TCACTCTTACTACCCGATTCCAAAGAAAwiiawAi i i iui iovjv.wi.s~^*« 
AGAGCAAATAGCTTCCCCCTTGAGTGAGGCTGTCCTTCAAAGTCAGCAGC 
CTTAGTTGCCCACACTCCTGTGCAGAGGCTTTGGCTACTGTGGCACGATG 
CCAGGCAGATCACCACAGCTAATGATGGGTTCACCGCACTTGAAACTTTT 
GCCCGTTACAGCGGAGAGATATAAGTTCCTGCTGGGCGGTAAAATTTCCC 
ACAAGGAACCACCTGGCATTGGGTGGGACGGATGTTGGGGCAAGGGGGG 
AAGACTGGGGAGGGGGATGGACACATTATCGCTCCAGCACTCTTGTTTCA 
GCCTCAACAACAGGAAGAGAGAACCCACAGGCAGTTAGGCCATGTCCATC 
AAATGACCCCATATTGTGGAAGAATTGACATTGCACTATGCCCAAGAGAC 
TTG^GTGGACATGGTCCTGGGAGTGCTTGAGCCGTCTAATTTCTCAGGGT 
CACACTCCTGTTAACAAATGCACTGCKICAGTGCAATCAAATGTGCCATTT 
CTAGGACCAAAGTTTGTATATTCCTTTTTAATATTTTTTTTCACTTGTOT 

TGATCATTTGCCTTAAATTAACTTTCTACrTTGTTTAAAJ^^ 

TAGCAAGCTGCCAGGAAGCCAGGCAGGGAAACCAGGATGTTTCCATTTAC 

OTGTTGCTCCATATCCTGTCCCTGGAGGTGGAGAGCTTTCAOTCA^ 

G^ACCAGACATCACCAAGCrrTTTTGCTGTGAGTCCCGGAGCGTGCAGTT 

C^^TGATCGTACAGGTGCATCGTGCACATAAGCCTCGTTATCCCATGTGT 

tvt ~~~*m„fr>m*. « »«w«w~/~ar!^»r&T«TTrtTTTaGGTATAAAA 




TCAGAAGGGCAGGCCTCGTGAGGCAAo^ i * i ^^^"-r^ 

GGActcCTGAGCATATACGGTCAAAGTCTGATGACAACACCAGWG^T 

GAAGCTGGGAGTGGGGTGGCTAAGAACACTGGACCTGACACTATTAGACA 

GGG CC^GCTTCAGGTCTATTACTGCTCACTGTGGCCGAGCAACAGAG 




CC^CCTCTGGTGACAATGTAAGTGAAAGGCLL-^i^i^^u^^- 
AGTTG C AAAT GTCAGT AGC CATCAAGAT CTT CTTT AAGAAT AGTTTCCAC 
TAAAGAGATGA^GCTTTGGTTTCCAGCCTTCTTTGTTTTGTCTCCCCGC 

TG^G^CTTCTACCTTTAAAGGGCTTTGGCTCTGGGGGAA^ 

^Sgatgac^ccaagaggacacaagtggagatctactgcctg^ 




CATGGC 



TGGAAGGTCTGTGGGCAGGGAACCAetAiu j. iwwiw^***-**. 
•CACAACAACTGACGCGGCCTGCCTGAAGCCCTTGCTGTAGTGGT 

^^™fl^r!«*w^nnrr ATeeAGAGGGCAGAGGTCCAGG 
AvaAATAAATCCCATTTCTACTTTGTTCATCTT-rTTTGTACATGCCCCArr
TACACCATACATGTATACCTTCTCTATATCTT7TTGTATCCC AATGCTG 




, x ^^^^ i ^ LTACTCCTCCCCTGT ^ C ^ CC ^GTGTGGCCACC 
^^^^ GT ^TGACTGTGGTGATATGATGACCAAGTGGCCAG 

TAAAACAAAATACCATCWraTr & & irtwrr>r-«r» » 



TAAAACAAAATACCATGGCATCAAAGTGGCCCAGAACTCCCTTCTTTGAG 

^^^ G ^ GTTAGAGCCCT TCCTTGGGTTGGGAGTTAAACCCATAGTC 

TTACCTTCATCTGTTTAGGGCCATCAGCTTCAAAGAACAAGTCATCCTCA 

TTGCCACTGTAATAAAAACAGGGACATGTCTCAATTATGTCTTCTAAACA 

^IIT^Jin CmCCCTGTGTAC ^ GACrrGACTG TTCATAAGAAACT 

GOVAACAGCCTGCCTCTCAAAGCTGCCTGAAACZACCTGGCAAGTTTCACA 

GTGATATGCGCAGAACAGTCCAGAAGGCAGATTCTAGGCCTGGCAGGTGG 
GCACCCTGGGTGCTCCCTGTTGGATCTTCar^rrTaar^rT^^^o^,^^ 



* «ul tflrtnniL iUA^CTCTCCCTCTCCCTCCAAGCCACACTTTGC 
AAAGGGATTCCTTGTATTGTGGGCTTGGAATCTTTTCTCCCCATTTGCCT
CTGCAGGAAGCCCTTGCAACAACACATCTGGATAGCCTCCAGGTCCCAAG



GCTGGAGGGACTTGTAATGGGAAAGTAGTCTTTAAATCAGATTTACTTGG 
CACCCTGTTTGCCACTGAAAGAGGCAATTTAGGGGAAAAATCTGGTCTCC 
AAGCACAGATAACACTCTACTCTTGAAAGAGGAGACCTGCTCATGTTACT 
GGTCTCAGCGTCTCCACTGACCTGTAATAAGCCATCATTTCACTGGCGAG
CTCAGGTACTTCTGCCATGGCTGCTTCAGACACCTGTGTAAAAAGGAGAA 
AATGAGTGACTTCCCCATGACGGCTACGTTCATGTGTGATTTCTCTCAGC 
ATCCAGTGCATGGCAGTCATGCAAAGAAATGATCTCTGAGTAAATGAATG 
AATGTGTGAAAGAGAAGTCCTTTGGGTCTAGAGAAAAGCATTTGCTAAAC 




i i «<jvjv*iiai.v«A(jt iAvSATAAGCAGTATCCATTCCCA 

GAATATTTCCCGAGTCATAAGCATTATATTACACCTGGCATTTTTGCAAA 

AAGCTGAGAGAGGGAGGCAGAGAGGGAAGGAGAGGGAGAGACAGAGAAAG 

AAAGAGAGAGAGAGAGAGAATATGCATACACACAAAGAGGCAGAGAGACA 

GAGAGACTCCCTTAGCACCTAGTTGTAAGGAAGATTAAAGTCATAC^TGA 

GCAATGAAGATTGGCTGAAGAGAATCCCAGAGCAGCCTGTTGTGCCTTGT 

GCCrCGAAGAGGTTTGGTATCTGCCAGTTTCTCCCTCGCTGTTTTTATAG 
CTTTCAAAAGCAGAAGTAGGAefSPTrtArs&a ai-rrw^wrwn . m. ~~~~ 




^- * 4w^wv^w«Aui iA^vjo/uiRWjVj<jAAAA(»GTATTGGTGGAAGCTT 

:ttaggggaggggactaataaactgagataattctctggttcatggaagg 
3caaggagtagcaaactatgacacattttgcaaatgtatcaccatgcaaa 

TATGCATTGTTTTCCTGACAATCGTTGTGCAGTTGATGTCCACATTAAAA
TACTGGATTTTCCCACGTTAGAAGAATGTTTAAATTTAGTATATGTGGGA
3GTACAATGAAGGGCCAATAGCCCTCCCTGTCTGTATTGAGGGTGTGGui

~- CTACC^TGGGTGCTGTTCTCTGCCTCGGGAGCTCTCTGTCAATTGvJVG 

GAGCCTCTGAGGAGAAAATTGACCTTTCTTGGCTGGGGCAGAGAACATAC

GGTATGCAGGGTTCAGGCTCCTGACGGAGTTGGGGCAACCCTGGAGATAA 

GC* rr *ACACAACCCTGCAAGACCAGGTGCTGTTACCCTAGCCAATCTCAi.G 

GATGAACCAGATCAATGCCAGATGAGCTCTGCCTAAAATGATTTTTTGGT 

GAACTC'GAAAAGTGGAATATTGTTTCTGTAAGAATATCCATCTGAGACi 

CTATCTCTTGGTAATACCAAGAGTTATCAGTTTCTCTTTAACCGAGACAC 

CAGCAAAGTGCCTGCTCCAGGGTACTGCCCAGGGGAGCCCTCCATTTGTA 

GAATGAATGAGAGTCCAGGTTATGAACAGTGCCTGGAGTGTAGGAACACC 

CTCCTTTGCCTCTTTGACAGGTCTGCATCATAACACTTTTTTTTTTTTTT 

Z.1, ^« /.»^*r-<rrTrA(-rrTr.Trr.rrfAGGCTGGAGTGCAGTGGCACGATC 



^GAGCCAAGTCTAGCACAGTGCCTGGCATGTACATTCAGGTGGTAGAG 
^GCTGCTTGAATGGGTGAATGGGAATTTGACAGCATTTTTATTCAAAT 
-AGTATGTGCCAGGTATCGTGCTCGCTCTGCATTATCCAAGGGAGTGAGC 
CTCTG"GCAAGTATTTGAGACACGAGGGAAATAGG7TCTACTGTGGGAAA 
AAGAGCATTTCATGGACTTGCTCTCCAAGCAGCCTTCTC^TTAATrT 
GGCTCCO^GTATCTTGATATCAGGAGTCAGTCACAAGAACTCCATCTTTA 
GTAAGTTATATTTTCCACAGGAAATCTAAAAGCTGTTCAACATGCTAGTT 
TCCTGTGAATTTGATAAGCCATAATCCATTCCTAACACTGAGCCCTCCTG 
AAATTTGGTGTCTGGTCCTGCAGATAGCTAAAAGCCCTGTCTGGGTGGCC 
TAGGGGACTCCTCTGTrTTGCCTCCACAGGATCCACTTTGCAAACTAACC 




T C rri'TLl l 'TTCrTTCr nV 'rT T CTTTCTlTCTTTCTTTC^CTTTCTT 
TCCTTCTTTCTrCCTTrCTTTCTCTTTCTCTCTTTCTCTCTTTCTCTTTC 




ACTA^GCCTTTCTCAGCTGCAAGTTCCTCTTTACCCTGT^^WHi^ 
^CCAAGACAGGAGACTGACATTTATTCAAAGCAGCAAGTGCCCTGATAC 
^^GTGT^AATCATGGGCrrCGCAGCCAGTTATCAAGGTTGATCTC 

^-^GTC^CAATCA^ 
TGGGTTAGTTCTTATATTATTGTGTGTACATGCAGTGATGTCTGTTCTT r

jCAATCTGCCCCCAACATTTTCCCCAATTTCCCATCTCATTCTTGGCACT 
jGCTTCCTAATATTTGTTCTTATGAGTCATTTTCTTGTATCATTTCCATG 




*-^wAv-v» A i iuv-Li xu^Li I uTGQGATTATTTGG 

GAATGGGCACTATGATTTTTATCATATCGCTTCCACTTCCTTTATGGCAT 
CATCTCCAATGGGCTTCTTCTCCCTCTT^^ 



^w^. w^<rw~w ^^nnwurtMUii i lCTCCTCCZCTGGTCTAGAACAA 
GGAGGGCTTAGATATATGAGCAGGTGGCTGGGGCTGGCGAGCTATGTAGT 
CTCCAATGGCTTTTCCCTGATGTCGGAGTTGTTATGTCAGTTCTGGGAGA 
CCAATAAGACCTTGTCCTTCCTTTGGATCCATCAGAAAAAGCCCCTGGGT 




ATCCAGGGTCTGGAAACATTTTCTGTAAAGGGCCAGATAATAAATGTTTC 
AGGTACAACTACTCAACCTTGCATCATTTCAGAAAAGCAGTCHGATAATA 
CATAAATGAATGGGTGTGGCTGGACTTGTCCTGCGGTCCCCTGTCTTATA 
TCATTGTATTATATCATTTTTTCTTACATACAAATTTAGAAGCAATACTT 




'^i^ uni u iAi rAACTCTTCATAATAACCTTTAAAGTAGA 
TAATATTGAACCATTTGACCTATGCAGAAACTGAGGTTGAGACAATiVAAT 
TATTTAAGACCGCACAAACAGTAAATGCTGGAACTACGACTCAAATATGG 
GTTAACTGAACCAAAACCAGATCTTTATTTCTCACTTTTAATTGTTAGAT 
ATGTTTATTGCCTCATCTCCTGTCCACATGGTGCCCATCGGCAGACTCCT 
TTCTCATTCTCAGTGATTGAGTGACATTCTAAACTACATTGGCCTGGCAG 




-s.w-^-^ v* **w ^ wj. v. a 4 uuv. i 1 1 1 w 1 i w i i wTGGCGGTGACGTG 

CTGTGTGA ATTTG TTTCTTTCTCCTCTCAGGGTAGTACTGGGACTTTCCA 

AATCAGGGTTTCTAATGATCTCTCTTCNCTTTTCTGAATTTCTTCCTTAT 

TCCCATTCACTTTCTCATCTATAAGTGGCANCTTTGTTGCTGGAAGATAT 

CCCTTGTGCAGGGATTNCTCTTTAANAATTTGTCNNNACC 
>Contig54




^*wvjv.iu^wAi Annv.^.wiV3MAHi^i 1 vj^L-tiAtiACAGGAGGCCGTGGCCC 
AAGTTCCTGGAATGGGGTATTATTATGTCAGCACAAAGGCCTTTGCACAA 
ATGAAGGCTTTAAAAATGCAGTCCTAGTCAGGTGGAGGAGGGCTTATAGG 
ATTCCCAGGAATCTGGATCATTCTCTTGAGAGCTTTCCCTTGTCTCTGTT 
AAAACTCACATCGTACGGCCCAAATAACAACAAAAAATGGATGTAAATTC 
TTGAAATAACTTGTGGATGGGGGAACAAGGCCCACCCCCCAGATCTGCCA 
GAAGCTTCAGGTGAGGGTCCCAAATGCCAAAAAGTCTGGTATCAGAGAGG 
ATGGC CAGTGACNTGGGGACACATGC CCTTTGCTGTGTCACTCAAGGAGC 




CTCCT CGTGTC AGCTTACCTGGCTTTGCTGCGAAGAGGCCACTTGCATTT 
CTTTATTTTTTATATTTTTTTAATO^ 

TTTTTATTTATTTATTTATTTTTAATTTTTTTCT 

CTTTAAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACATACGC 
ATACATGCGCCATGCTGGTGCGCTGCACCCACTAACTCGTCATCTAGCAT 
TAGGTATATCTCCCAGGTTAATCCCTCCCCCCCTCCCCCCACCCCACAAC 
AGTCCCCAGAATGTGATGTTCCCCTTCCTGTGTCCATGTGATCTCATTGA 
ATTTCTTTAAAGGTGGAATCTCTCAGTGGGGTCTAATCTGTTCAGAAATA 
TCAAAAGAGTATCCTTGGGAATGACTGGAATTCCAGAGTCATCTGGTAAT 
CCTCATAAAACAACTCCTGGATGTCTCTCAGCACATCTCCCACCTTGAAC 
GCAGGA GGCTGGTTCAAATGGAGGAGCATCGCTCTACTGCACTTTTTTTT 
TTTTTTGGCCTAAAGTGCAAAAGGGGATACGTTTCATGTAAATAAATCAA 
CTGCAAATCG CTAGTT ATGCTGAGCCCTGTCCCGTGCTGTGGACACAAAG 
GAACCAAAGGCTTTTCTCCCCGCCCAACACACACATAACACACACACAAA 
ATCATAAAAACATACATACCCCCAACACATAACAACACACAACACACACA 
ZAAAATATATACACACAACACACACCAAACATGCCCACAAACCTGTGTCC 
.^^AAATAAATCCTACTGGTGGGTTTGTGGTCTCCCTAACTTCAAAAATGA 
AGCCGTGGACCTTCGCAUTGAGTGTTACAGCTCT7AAAGATGGCATGGAT
C Z AAAGAGTGAGCAGTAGCAACGTTTACTGTGAAGAGCAAAAGGACAAAG 
""""CCACAACCCAGAAGGGGACCCCAGCAGGGTTGCTGGTTGGGGTGGCC 
AGCT^TACTTCCTTTTGGCCCCTCCCATGTTCTGTTTCCATCCTATCAG 
AGTGCCCTTTTTTCAATCCTCCCTGTGATTGGCTACTTTTAGAATCCTGC 
^GATTGGTGCATTTTACAGAGTGCTGATTGGTGCGTTTTACAATCCCCTT 




CAGAAAAu j. _V_v.v_ J. us- v- *_n\- i. vjvj«<_ v. * • • 

AC C^""T-CAACTCCATAATGGCATGAAAATACATATGTTGTACAAAACATA 
CA T ACACAAAG7ATACATGCATCTCCCCAAATATACACATACCACAGAAA 
-ATACACACAGGAACTCAGCTACCTGTCAAAAGTCTGCATGGTGATTGCC 
-C^GCAGTGAGTAGTTAGAAAAGTGAATTTGTTTTTCAATAAATTGGAGT 
" C^AAAAATCGTTGT AAGATAGAAAATTTTTAAAAGT ATATAAAATAAA 




T GGAGT AATCAATeATATAl\iUAAA(jAl 1 J.«j'j«v-*/\>J<-nj.«..L i«~w.~w~ 

GAATTATGTATGCATATGTGTGTGTATATATATATATATCTQATACATAT 
AATAATGTAAAAGTGAAAATAACTCAGATGTTCAAAATTGAGGATTAGTT 
AGACTATGATCTGTCCATATGTGACATACAAGTTAGCTGCCCCTTATTCT 
C^CGAGCT-CAACCTCCTATAAACAGTGTCCCTTGTATATCAGTATTGGT 
ACAGATAATCGAACTTATTGAGGTTTTACATGGGGCAATAAAGGCAAGAG 




ACAAGTC""CTATTGCAACTGCCTCAACATGGCACCCTCCCTGCATCTCCA 
-C^CCTGTCCTGAGAGCAATGGCCTGCTGCCCCCACACTCACATCCTC 
iT^CATTCCAGAAGTGAGCACCACAGAAGTGCCTACAGTTACCCCAACCA 
CC^CTTAGAAGATAAGTTAGTGTTTGTTTTGACTTTTTAAAATTTTTAC 




TTGTATTCTAGCCATCAAGGGAATAACATTrTTCCAGGTCTTTAGACAAA 
TAATGGAATACCTTGCAGTAATTAGATACACTATTGTAGAAAAGTATTGA 
TGAAATGGAACGATGTTTGAGATATCATATTGAGTAGAAAAGGCAAGATA 
CATTAAGTAGGAAATGTATCTTACAAAATAATTTGTCAGACACACTCCTA 
TATTTGTATGTTATATAAATGCGTATGTGAAGAAAGGCTAGAGGATGAGA 
_„„„,,,,, »^^t-t-t» anartaTnafirtrTGCAGCATGCTCAGAAA 



GAGGATGCTAT^ATTATTGGCCAAAGGAATACTTGTGTTGTATWGCA 
^cScTCACAAACTGTTGATTACAAATGAGTACGVGACCTAGCTCCTTC 

A^G^CACAC^CCACGAGC^TCCCTC^CCAGGTAGGTCWTCCTCCTG 




tGGAGGGCCAGGCAGACACTAGCTTAG 
CCTCCCAATGAAGCAAGTCACGTGAGTCAATCCTACCCTAAGATATTAGG
GATTGAGCCTCCTG GGACA TTTGGTGGCTTAGGTTTTCATGAAAAGAGGT 
TGCAGAGCAACTGCTTTTTGTTAGGCAAAGATTAGGCTACTGCAGAGACT 
CAGCAAACTTCTATAGAAGGTGTCAGATGGTAAGTATTTTAGGCTTTGCT 
TGCCAGATGATCTCTCAACTAGTTAACCATGCTATTGTAGCCTCGAAGCA 
GCCAGAGACAATATGTAAACAAGAGCATGGCTGTGTTTCAATAAAACTTT
ATTTAAAAAAACAGTCAGGGACCGGATTTGGCCAAAGGCCATAGTGTGCC 
AGC C C CAAGACTAGAGCAATGCACTTTTAACTTTTTTATTTTATTTTTGT 
AAAATGCCAAGATCCACAAAAATGCTATTGCACCCCGTGTGTTAGCACTG 
TGACTCAAGGTTTGGGAAATTCTGCTTTGAAGGCGTGATAGACAGGAGAG 
CATGGTCTGGCCCCTTGGTGCCTTTCTGGTTGCAGCGAGCATTTCAAACT 
ACAGAGCAAGGCCAGTGGTCTGTTCAGCACTAGAGACATGCAGCAAGGTG 
TCCTGGGGTGAGAAGATGCCATAACTGGTCCCCTTTCTATCTCCTTAGGT 
CTTGGACTTCATTCCATTTTCTGTTGAGTAATAAACTCAAC^ 
GTCCTTTGTGGGGGAGAACTCAGGAGTGAAAATGGGCTCTGAGGACTGGG 
AAAAAGATGAACCCCAGTGCTGCTTAGAAGGTAAGGTTCTTGTAGAAATC 
TACCTCAGGGCCAAAGTGTAATTCCTAGAGCAGAACTTTGCTAGGTGCTG 
TGCACAGACCCAGTTGTTTCCTGCTGACTTGCACAGTAAGTGAGCTTTCA 
AATTTCCCTGGACAAATAACTAGACAAGAGAAATTCTGGAAGAGAAAAGG 
AAGCTTTGCTTCAGTGTCCAGGCACATCAGGTAGTAGATAAAAGGATCGT 
C CTCAC C7ACAGATTTGGGGCTTTAGCATC CTGTTTGCCAACTGGATGGT 
TGCATArGCTTCAAAATGCACCTCTTCCCTCCCAACATTCCCAAGTGGAA 
3AGAAGC C7CCGATGAGAAGGAACTC7CTAAGGCTGGGCTGAACAAATGA 
CCCAGGCACAGGGCATCTGAGTATTCCATGAGGAACACATTTGGGTGTTG 
CCCATGGGGGACAATAGGAGGAGGCTTTTGACCCAAATGATTGTCTACTG 
AGGTGTGACGGGAGAGGCCTGTGACATGCCAGAGGCCAAACCCGTGATCC 
AGTTCATCTCTATTCTATGTTTCTGAAGAGGGAAGCTATGATTTAATGTC 
ATTACTATCATGCTGCTCTAGTATTTCTCAGCACATACACAGAAGAGGGA 
ATTAAATGGTCCTTGATACCCCTAAATCCTTGGAAAATCCGAATTGCATA 
TGCTAACCTCACTGCGTCTGACTGCAGACCCGGCTGTAAGCCCCCTGGAA 
CCAGGCCCAAGCCTCCCCGCCATGAATTTTGTTCACACAAGTAAGGCCTC 
GGGGTGAGGTGATGGGGGTGGCTGAGGTGCGAGGGTGGGGATGGGGGATG 
GAGCCATTGGGTCCTCTTACAGGGTGAGAGAATTGTAGAATGGGGACACC 
TAAGGGTGCTGGATGGGGCTGAAGTCTTTCCTTTGTGGAAGCAAATCCCA 
TTAGGAGATAACTCTGGGAAAGATGAGCCCGGGGAGGGGCAGGTGATGCT 
CACCTGCTAAGAGGCAAAGGGCAAGGAAGAGTTTGTGCCTGGGAACCTTC 
CAGGTGCCTCTTCTGACCATAGCCAAGAGACTGGAGACACAGACCTCCTC 
CCAGCACTGAGGACAAACAGCCATGGGGCCAGTGGGGGTGCAGGGACACC 
CACACCACTAAGGGCTCAGGGCGGCGCCTTCAGAGCCTGAACCTTCCTCT 
CATGCTGCCATTTGAACACCACAACACCCTAATAGGAAACTGTTAACATT 
GCCACTGTTCAGGTGTGGAAACCGAGACAGACAGTGGAGATTCCCTGCCC 
TAGGTGACACAGGTAATAAGTGACAGATGTGGAAATTTAAAGGTACTATA 
ACGTCTC-TCTGCCTGACTCAGGCTTAAGGCTCCCATCACCTCCTCTTCTC 
AGGACAGAGTCAGGAGGCCTCAGCCTGAGCCCCAGCTCTAGTGCAGGTTC 
TC^CCACCGCTGCTGCTGCAGTTACCTTTGATGTTTTAGTTTTGTTGTAG
TTACACCATTGCTGGCTTTGGATCTGCACTGTGTCCACTCCAGGTGGAAC




GGGGCTGCTTTAACAAGCCTTCCAGGTTATCGTGACGCACCTTGAAAGTC 
TGAGAGCTACTGCCCTACAGAAAGTTACTAGTGCCCTAAAGCTGGCGCTG
GCACTGATGTTACTGCTGCTGTTGGAGTACAACTTCCCTATAGAAAACAA 
CTGCCAGCACCTTAAGACCACTCACACCTTCAGAGTGGCCTTGAGAAAGA 
^TTGGGGTCAAGGATCATGAGCGAGAACACCACTTAAGAGGATAGTGAAC 
TAGTCTGCATGTGAGACGCTGAGATCCTATGTCAGGCTGTGATAGGAGGG 

AAACAGAAACCAAAGGAAAGAACAGCTTTAAGAAG^ 
AAGTAAAATGATGGTGCTAGAAAAGTAGCTTCTTAAAAAGAG^TTTTCC 
AGTCTCACCCTGGACTAACTGAATGAGAATCTCAGGAGTGTGAGGCCCAG 
A ^TEI-.«^i^»;^«or i rrr i rraGGTGATTCCCAGTGTGCACC 




AGGGGTGAG AGTCACAGCCTTAGGCCATGt uau i uwi^^ 
ACCAGCAGCACCCACAGCTCTGGGAGTGCATCAGAAAGACAGAGGCTTGG 
S^CCCACACCTACTGAACCATAGTTTGCAGGTGATTTCTTGCACATT 

SSgTGTGGGAAATGGAAAACOTA^ 

ACTCAACCTGCACCTGCTCCATGAACTCAGACTGCCTGGGATGGGCCCAG 




CTCATCTGACCCCATATCACTGGGGAGTTAv-i i auvja i v. i * 
CAGTCATCTCTTCCATAGACACTGAGAGTGTCCACGATGCTTGGGGCACT 

A^AGGGTGGGAGGTGGAGGATCACGGGTGAGTCAGATAGGAAGCCTGCTC 

^GGGAGCTTACAGTGCTATAGGGCAGCAAGCCAAGGATGC^TACCT 

rTGTGCAGGTACCACTGACGAGTGCAGAGCGCTGCAGCACCAGAGAGGAA 




GCAGGGATCTGTTCCACTf lAUW-Av. ha, r 



CTTA 



ATAGCAGTTCCAGATAAAAACTACATACGCCCAGCTGACTCT^TT^ 

GGCAGACAC^AGA^TGGAAGAAGAGGAGGCTTAAACCG 
GGCGTGCTGCTGTAGGGTAGGAGGGTGTGAGGGAGCACTCGGAGGGCAGT
GTGTCTGCCCTGCAAATTTAGTCCTGGATGGAGCATCCTTTCACTTGAGG 
GGAGAAATCTTAGGAAGCTGAATTAGATACAGATCTAAGCCATATTCTCT 
AATT7TAAAAACTATAGAGCTGAGATTTTGGTATCCATCTGACTCTTACG 
TCTCTCTCTCTCTCTCTCTCTCTCTCAGTTTATTTTTAATCTGGGGGACA 
AGAAGGCCTGGAAAAGAGGGCATGATTGCTTATCATCCCTTAAATACCAG 
TACCAAGGCTGACACGTCATCTTTCCCAAGGACCATCTGCCTTCTCTCTT 
TTCCTCCTCTCCTGTGTAAAGGCCTGGAGGATGAGCACATGTGCTGTGTT 
TTCCTCCCTCTCAAAGCCTGTGCTATCTAATTAATCCCTTTTACCTCACA 
GAAGGAGAAACTGATGAAGCTGGCTGCCCAAAAGGAATCAGCACGCCGGC 
CCTTCATCTTTTATAGGGCTCAGGTGGGCTCCTGGAACATGCTGGAGTCG 
GCGGCTCACCCCGGATGGTTCATCTGCACCTCCTGCAATTGTAATGAGCC 
TGTTGGGGTGACAGATAAATTTGAGAACAGGAAAC^CATTGAATrnrCAT 
TTCAACCAGTTTGCAAAGCTGAAATGAGCCCCAGTGAGGTCAGCGATTAG 
GAAACTGCCCCATTGAACGCCTTCCTCGCTAATTTGAACTAATTGTATAA 
AAACACCAAACCTGCTCACTAAACTTTCTGTCATTGGGTTTCATTTCTCA 
TTCATGCTTTAAGGATTTGTGTTTTTAGGATATAGCAAGAAGCTTGTTTA 
ATTACAAAGTTCTGGGTTGGAAAGAGACCGGCTTCTGCTTGTGTACTGCT 
ACCCTGAACCATCAGACATGCATGTGTGTGTCATATGCTATGATGTGGCC 
AGTCTGAGTGCAATACTTGCAGCGGGAAGGAGCAGCTGGGTGCATGCTGT 
GCTCTAGAATTAGTCTTTCCTACTGGGGTTTGGTAGATTCTGAGGGCATT 
GATCCTGGGGCAGAAGTGGCTGAGTCTGTGTCTAGGGTACAGTGTGCAAG 
AAAGAAATGTAACAGCAAGTCACAATCCAGCCAAGTGATAGTGGAAAAGG 
GGTAGTTAGGTCCCAGATAAGGAGCAGGGTGACTTGACCTGTGGGAAAGG 
CACAGAGACAAGGAATCTGGGTCAGATGACAGCCAGGAGACCAGGTGAGG
GAGGAGCCAGGTACTGTCTGGGAGGCTTGTCAACAAGGGCATGGTCCTAT 
CACTAAGCAGGGCTCAGATCCTCATAATGGGGGAGTGGAAGGCTGGCCGA 
ACAGAAATCAGGGCCTGGAAACAGAGTGAGGGGGTGGAGACAGGAGACTG 
AGGCTTGGAAATTAGTTTATTAGTTTTAGCTCTTCAGTTACAAGCAATAA 
TAATAGCTTCTAGCTTATTTAAGCAACAAGTATACTACAAAAGGAGCTTT 
CTAGAAGGATATTGGGTATATTCATTTCTTACTGCTGCTGTAACAAATTA 
CC^CCAACTTAGTGGTTTAAACAATGCAATGTATTATCTTGCAGTTATGG 
AGGTCAGTCTGGAATGTGTCTCACTGGGCCAAAATCAAAGTATCAGCAGG 
ATAGCATTGCTTTGGGAGGCTCTAGGGGAGAGTCAATTTCCTTGCCTTTT 
CCAGCTTCCAGAG GCCAC CTGCATTCCTTGGCTAGTGGCCCACTCCCATC 
TTCGCTGCTTGGGTTTTTCTCACACTGCTTTGCTCTGACCCTCCTGCCTT 
CCTCTTTCACATATAAGAACGCTTGCAATTTACATCGGGCTCACGTCAAT 
ATCCAGGATACTCTCCCGTCTCAAAGAGGCTTAACTTTAATCACAGATGC 
AAAGTCCCTTTTGCTATGTCATGTAACATATACACAGGGTCTGGGGATTA 
GAATGTGGACATTTTCGGGGTGCCATTATTCTGCCTATCATGTGAAGTAA 
CTTTCAAAATGGAAAGACATGCTGAAGAAAAAGTCAGGGATTTCTGGCAG 
GCCAGAAATGACAGAAGGCAGAAAACGTTGGTCCCATCACTCAGATGGGT 
AAGAGCCAATCATGCTTTTTGTCAGTTAGCAAAAGATTGAGATTCCAAGC 
AAAGCATGCAACTGCCCTAGTTTGGGTCATGTGTCGACTCCTTGGTCAGT 
GAAGGGCAGCACACCTTGATCAATACTCCCTCCAAGACTGTATCCAACGA 
GGCCAGTGATGTTCCTCAAAGCAGAGCTAGAGAGCTAATCCCAGGAGAGA 
GGCGTGTGGGTGGTGGGCAGGAAGACAAAGCTCAGCCGTAAAGGAGTAGT 
AGGGACAGCACCCTAGGCATGGAGGCTCAAGTGAGATGATACCCATGGGA 
CAT C C? AC CTTGATCATTACACATTC CGTGCATGT AATGAGTACTT GCAT
GTATGCCATAAATATGTGAAATATTATGTATCACTATATAAAAGAAAAAA 
AAATGTGGCCAGGTGACATCCATATTTTGGAGAGGAAGGCATGTCTTCTT 
CATAATATCACAAAACTATTTTCACAACAAAGACACAGCTGTTCAAATTA 
GTCTCTGAGCCGGGGCTGTCTCATGGCAGTGAGGACTCTGGTTCCCTTAC 
AGACTAGCAGAAAGGAGATGGGGCTTACTGACCATGGCCTTGAGGAGGCT 




ULWoW • k ^WW*WW A W*** - 

C^CATGCTTCCTGTGAGGCTCCACCAAGACAGCAAGTGCATCAACACCTT 
ACGGAAGCACAAGGCCCTGTTTGTTGTTGACTTCATGAAAGGCATGGTTG 




CAGGCTCTAGACTGACTCCAGGATGAGTATTTGGAAGCTGAAfiTCAATCT 
GTGGTCTCTTCTCCTGTAGAGCAGGAGTCAGCACTTTTCATAGAGTGCCA 
GATTCTATATATCCTGCCACATGCTCTGTTGTTACAGAACAAAGAAGGCC 




1 1 i inwL 1 uLAu 1 1 1 y \, Cnn^^ ^ w * w wiw * now**- ™. - - 

iGTGGTGGTGTGCTGGAGCTAGCTTATATCAGCTTGCAATAGCC 




CTACTGACTCCCCTTTGCCCTGTCTTATTTTTCTCACTCTAACATGCTGi 

ATAGTT^TCTTCTTACATTTATTGTTTGTGTCTTCCACTAGCATGTATGT 

CCCACAAGTTCTTTGCTCTGTGATGTATCCCAAGAACCCACTGCAGTGCT 

TGGCACTTGTAGGAACTCCATAAGATTTTTATAAATGAAGAAAGGAAGAA 

AAAAGAGAGGGAGGGAAAAAGGAAAGGAAGCCTTCTATTTAAATGATGGC 

CTTCTCCATATTTCTATAGTAATATGACTTCCCTTGCAAAGGGGGATGCA 

XTTTGGAAAATGTGTATAAATAAACTCAGGTGGTTTTGAATTTCATTTTC 

CTAACTGTAATTCTAATCATTGGTCTTTATGTTTAGTGAAAAAGTTTTGG 

CCCTTATGCCTCACACCTGAGAATCCCAAAGTATTGGTTTGTWGAGCTC 

CCATAGAGAACCATAAACTGGGTGGCTTAAAACAACAGAAATGTATCGTC 

TCCTGGTrCAGGAGGCCAAAGTCTGAACTCCAGGTGTTGGTTCATTCTGA 

GAGCTCTGAGAGAGAATCTGTTCCAGGCTTCCCTTCAGTTTGTGGTAGCT 

C CAGGGTT CCTTGGCTGGTGGCAGCAAAACTCCAGTCTCTGCCCCCATCT 

-"-ACATGACTGTCTTCTCTCTGTGTTTCTGTGTCCAGATTGTCCTATAAG 

GACAGAGTCATACTGAATTAGGGCTCACTCGAATGACTTCATCTrAAGTT 

3AACTGTATCTCTAAAGACCTTATTTCCAAGTAAGGTCACATTCACAGCT 

ACTGGGGGATAGGACCTCAACATATCTTTTTGGGGGACATAATTCAACTC 

ATAATACCCAACATGATAACTGTTCATCCCATGAAATTTAATGTCTCTCA 

AAAGGTGATCTCAGGGCATTTAATCTGTGACAGAAACTCCCATAGGAAAC 

?___ »^~t.^wi«r^»^mr!r*pr!ri'rrirTPCTCCTACCCCATCC 




AGCTATCCAGCATAUUA.iAiiw»i«««-i.»-«^ i * ".1 

AACATTTTTTrGGTCCCAGTTATCCTAATCAATTAAACAAACTCTAGAA 

CCATC^GAAGTGCAGGCATTGGCSACATTATGAAACTTACACAGAATTCA 




AGCA^GCACCACCGCTACCTTrAAGCTCCTTGTGTrAGTGCAAGGGT 

GCAAACTGCAGCCTAAA(^CAAATAC^GCTTACTGCC7GTTTT^^ 

ttg^cScACAGCCATGCTGTCTATGGCCTTCTTTGTTCTGTAACAACAG 

AGCATCTGGCAGGATATATAGAATCTGGCAGTCTTTAATAAGTGCTGACT 

P^^i^i^arAGGAGAACACAGATTGTCTTCAGCTTCCAAACATTCATCT 
AGGGCTGCAATGCATGA'iAGTTGGGGTTTTCACCTCTCACCCAAAAGCCT
ACTCAATTTTTTACTGCAAAAACATGTTATCATCATTATTTTTTACTTAG 
C CCAC CTTTC GTTGGCAATTTTCCATAGGAAAATGCATTCTAAATTTCAA 
CTAATCAGGGGACTTGGAGCCTCTGGACACCCCCTTGTTCCTTGCCCACA 
GTCCCTTGCAGAAGGTGCCTTATCAGAGCGGCTCCATGCAGGGGCTCAGG 
ACAGGATCAGATGTCAGTTGCACCAAGGGGGCAGGGACAGATCCTCTCTG 
CTfiACCATGCAGAAGGGACTGTTCAGTGCACCGTCATGGTCCTGGTGATT 
T CTGGTC CATAAGGGAATTTTCACATGCATCGGGTGATTGTCACATCAGC 
ACAACACTGTGAGGAAGGCAGAGTGAGAATTTGTGTGCCCATTTTATAGG 
TGAGAAAACAGATGCAGAGACATTAAGTAACTTCACCACAGTCATGCGGG 
TTTTAAGTGGCAGACTTTCAGGTGTTGTGACTCCTAGTCCAGAGTTCTTT
GCACTGCCCCTGAGGTGCTAAAACTCTACTGTGCTTTAAGACTCACTTGG 
GGAGCTTCCTAAAAAGAGAGATTGCACAACCTGAGATTCTTGTTTAACTG 
TTTTGGGATGTAGCTGVGGGATCTAGCTGCCTTAAAAAAAAAAACTCCCA 
AGTAATT CTGATGCAAGCGGTTCTTTTTTGTCCACCTTTGAAGAAACACT 
GCCTCCTCCCCATACATTTCATTAGAAAATGGTAACATGTTTTTCAGCCT 
GAGAGCCATTTCTGGGTGACCGGACGTCGGCAGCCCGCTGTACTAGCTTT 
CAGTCTAGGCTTAAACACACATGATAGGAGATGTCCTACTCCAGATGATA 
TGAGTCTGAACCATGGAAAAATTCCATTGTGTGGCACATCTGGTGGGTGT 
GCACTGTCCCCAGCAGTGAGGCACCCAGTGAAGACAGCAGCTGGGAGAGG 
CTTAGTTACATGCAGTGGGACAGTGTGGGCTAGACTGCTGAGCCCtCTGC 
AGTTTACTCTGTGTCAGGCAATGAGGGTGAAAGGCTGATCAGACCCACGT 
GCAGAC CATACCCTCCAGGGAGACAGATATCAGTCAGGACAACCCCAAGT 
GTAGCTGGAGAAGCAGTGCCCAGGTATGACCGGATGTGTATCCAACCAGG
AAATCTGCATATAAATATAAGAGGAGAAAATGAACAGATGTTGCTCTTAT 
ATGTAGATATTTATGAAGAGCATATAATTTTGTTTTGTGTGTTTTAAGAA 
GTTTATAAGTATGCCTTAAAAATGTATAGTATATACTGTAGGTATTTTTT 
CCATTAGATATTTTGTTTTTCATACTTATCCACATTGACATTGTAGCAAC 
AGTATAATATAACAACCTCCTCTACAAAAGCAGAAGGAAGTGAAGCTTTG 
GAAGGAAGCACCCAGTGAGCTTGCCCCTTTCAGGTGGGTGCAGTGAGCAG 
GAGTCAGTGAGGTTGAGATCCTTTGAGAGGAGGCAATCATTAACCAGGAA 
ATCTGCACTGCATCCTGGCCACACCTAACCCTTGGACAATGGTGCTTGGA 
GCGCCTTCCAGCTCTTAAGGCTTGCGATTTCTTTCTCTCACTCTTCACCC 
ACGATGATTAAATCTTCTCCTACAGAGTTGGACAATAAAGCCTTGAGTTC 
CTGCCTCCCCTGGTGTGATCACGAGGCATAGACATGGCCAGGAACATGTA 
GGTGTCTTTGAAAGCTGAACAAGTTAGTAAATTTCAAACCTCATTTCACC 
CACCAGTAAAATGGGAATAATAATAAACCTATTTTACATAGGGTTGACAA 
GAGGAGTAAAGAGGGATTCAATGAAAGTTCGTTATTATCATTTGTAGTAG 
CAGTGTTGATAATATCAACTGAAAGTTCATTATCATTATTAGTAGCAGTA 
TTGATAACCCTCTTTTCTGTGCCTTCTCACTGGTGGGCCCAGGCCATCAG 
CAATGCCCAGGGTGTCATGGATCTCTGCTGCATCGGGCACCAGCTGTGTC 
AATGGTGAGAACAGTACAAGGGTGGGCAGGGCAAGGCAGGAAGCACCCAG 
GAGCAGCAGCTTCATGGGGTGAAGATGTCAGGAGCTTAGGGACAGTCAGA 
GCGGGTGTGCCTCCTCTTGTGGAGCCTTTCTGCGTGGGTAGGAACTGCTG 
CAGCTGTGGCCATGGATTCACCTGAATATGGGTGGAATTAGGCATTCAGC 
TGGGTTAGCTGTGCCTAGAAGGAGGAACTCTAAACTGAGAACTTGTCCCT 
ATTGCCACCTCTGATAGGCAGATGATCCATCCATCAGTGGCTGAGCTGAG 
GTGTGCATGGGGATGGGTAAGAGCCCACACACAGGGCTGATGACTGAGTC 
TATTTAGAACAATAGATGTAAAATCTGATAATGTAAAATGTGATAGATTA 
TTTTGTCAATTAGAAATGGTACCATATAATTATATATATACATAAACATG 
TATACATATACACACATATACATGTGTGTATAAACACACACAGTATTGTC 
CCCTACTCATTCCATAAACCTGATGCCTTTAGCTGGGATTCCCAGCTTTC 
ACTCTCCTCTCTGTCATCTGCTGTCTATATCCTCCCCATCCTGTAATTCT 
GGCTTATATGCCACTTCCTCCCTAAAGCCCTCCCTCAATCCCTTGCTGGA 
AGTGACATTTTCCTCTTTGAGCTGCCCCTGCTTGTGCTTTGGTGAGGTCA 
GCTGTATTGCAGTACCTTGTATTGTGGTTGTCACATCATCGTATAGAATT 
AATTTCTGACACATTCCGTATTTTTCAAAGGGCCTAGTGTGGGGCTTTAA 
CAGTAACTACGCCACCACGCCCAGTTAATTTTTTGTATTTTTGGTGGAGA 
CAAGGTTTCACCATGTTGGCCGGGCTGGTTTCGAACTCCTGACTTCAGGT 
GATCTGTCTGCCTCAGCCTCCTGGAGTGCTAGGATTGCAGGCATGAGCCA 
GTGGTGTCACTGATGTGTCTGATGTTTAGTTGTAATTATTTGCTGGGCCC

CTGTCATCCCTCATAXCTGATAGCTCTTTGCTAGTCAAAGTGTGGTCTGG 

GGATCAGCGGCATCAGCATCACTTGAGAACTTGTTAGAGATGCAGAATCT 

AGAGCCCCACCCGGGACCCAGAAACAGAGCCTGCATTTTAACAAGCTCCC 

CAGGTGATTCTCACACACACTCGCATTTGAGAAGCACTGGGCTAGTTGAC 

AGATTCTCAGGCATGGCTGACATTGAAATATCCAGGGAGCAGGCTTGGCA 

TXAGGATGTTTAAAAGTCCTCCAGGTGTTTCTAAAGCCAGGTTTGAGGAA 

TTACTGGGCTGATACAAATGTTTTGTGATGATGCTTTGTGTGTGTGTGTG 

TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTAGGGAATTC 

TGGGTCACTTGGCACCAACACAGGAAACAATGGAAATATGTGAGCCATGA 

CAGAAAGGTCAGGAGATAAAAGAAATTAGTGACATGAGAGGTACTCCTCA 

GGTGTTAGGAAAGAGGGTAGAGCAAACCAGGTTTTCCACCATATGTTGGA 

TAGGGGGTCAAGTAAATTTCTACTTAAAAATTACAAACAGGGGCTGGGCG 

CGGTGGCTCATGCCTGTAATCCCGCACTTTGGGAGGCTGAGGAGGGCGGA 

TCACAAGGTCAAGAGATTGAGACCATCCTGGCCAACACGGTGAAACCGTG 

TCTCCACTAAAAATACAAAAATTAGCTGGGCATGGTGGTGCGTGCCTTTA 

TTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCTGGGAG 

GTGGAGGTTGCAGTGGGCCGAGATCGCACCACTGCAATCCAGAGCGAGAC 

TGTGTCAAAAAAAAAAAAAAAAAGAAAATTCCAAACAGGATGACCCTAAG 

CCTGCAGGACTTGGAGACATCTAGGTGACTGATACTCAGTCACAAAACAT 

AAXTGGTCACAGGCCTGATGAAATGCACAGCAGACCTTCAGATGGTATGC 

ACTCAAGTGATATCCACAAGTCCACCTAAAGAAATGCTATATTCAGACAT 

TTGGCATCAATCTCTATCAAACAAAGATAGTCCAAAGCAATGGGTTCCAA 

AAACACTTTCCTAAGACAAATTCTCTATTTGCTTTTAATATCAGTCATCC 

CAGCCCTTGGAATAGAGGAGCAAATGATACCAGTGGTACCCTACCACAAT 

GCACCAAGGTATTATACTCTCATGCTCCATTTTCTCCCTCTGTCTACATC 

ACTAATAACTCATTGATTTCTGGTGCAAGCCCTGCTGGGAGAAAAAGTCT 

ACTCTTGTACCTTGGAGCAAGTTGCTCAGAGTAGGTATCGAGGATAAAAT 

TTGGAAAGTTAGAAAAGCTATTAGAAGGAGATCCTAGTAGTTGAAAACAC 

AGCCTGGCCAAGTCAATGATGCTATTTCATCTCCCCAGCCTTGCATGTCC 

ATAGCTAAGGAAGACAATTTAGGCTTGGGCTAGAGGATGGGAAAGGGCAA 

AATTACTGATGCCACAGCCCAGAGAGGTATTCTAGTAATCTGAGGGTGAG 

GACCACATACCTGGTTCAGGGACGTACAGTGTTGACAGCTGTGAGTGGAT 

GCCTGGAGTTCTGGCGTGTCTTCTAGCACAATGATACCTGAGACTCTTGC 

ATCATTGGGAATAATAAAATGGGAGTGGATAGATATGAAATTATGATGGC 

AATAAGCAATCAGCTAATAGCTTCATTGATGGGACAGATTAAAGATGGCT 

GCAAATCCTTTGGTCCAGGTTTGGGATATAGGCAGCATTTGTATTGGAAT 

3CTGATAGTCTGAGGCCATGAAAAGTCCACCTGCAGTAGTGGTAGGAGGA 

ACAAGCCTCACTTTCTTCAATGTGTGTGACTGCTGTCTTGATTCCCTGGG 

rGGCCAGTTCCATTCGTGTGGTTCTTTGGTCCACTTGACTCTGGGGTGGC 

TCTGTGATGGCTTGACCAATACAATGTAGTGGAAATGATGCTGTCATCAT 

TTCCAGCCTCTTCCAGCCTTAAGGAACTGGCAACTTTTATTTCTGTCCCT 

TGGAATACTTGTTCTTGCAACCCATCCATO^TACAGTGAGAAATTCTAAG 

CTGCCCCATTAAGAGGCCCACATGGTGATAAATTGGGGTCTTACATACAG 

CCCTAGCTGTGCTCCTAGCTGACAAACAGTAGCAACTTGTCACCAGGCGA 

GTGAACCACTTAGGACTGTATACTCCAGCCCCAGTTGAGCAATGTGGAAC 

AGAGTAAACCATCTCAGCTTAGCCCTGCCCAAACTGCAGAATTATGAGCA 

AAATAATCCCCTAGGCTTTGGGCTGATTTGTTCCAGATTACTGGAACAGA 

ATTTGGTACCAGGGGTGAGGTGCTACAGCAATGAAAGCTTAAGACACGTG 

ACTTTGGTTTTGGGTCTGAGTGGCAGGGGAACTTGGCAGGCCTCAAGGAA 

ACTTTTAGGGAGGGTTGAAGCATAGTGAGGAAAACAGTAGGGGAAGCTAG 

AGGAAAAAATGATGCTTGGTATGTAGTGGTGGGAAGTTTAGCAAAACTCG 

CCTGATGTAATGTGGGAAATTGTAAGAACTCAGAACGATTTAAGGGCATG 

TTTTATAGGTCCTTTAAGAAACTTCTAGGCCAGGCGCAGTGGCTCATGTC 

TGTAATCCCAGCACTTTGGGAGGCTGAGGTGGGCGGATCACAAGGTCAGG 

AGATCGAGACAATCCTGGCTAACATTGTGAAACCCCGTCTCTACTAAAAC 

TACAAAAAAAAATTAGCCGGGCATGGTGGCGGGTGCCTGTAGTCCCAGCT 

ACTAGGGAGGCTGAGGCAGAAGAATGGCGTGAACCTGGGATGTGGATCTT 

GAAGTGAGCCCAGATTGTGCCACTGCACTCCAGCCTGGGCAACAGAGTGA 

GACTCCGTCTCAAACCGAAAAAAAAAAAAAAAAAAAGAAACTTCTAGGGC 
""GGTCCCGTGGAAGCCTCACACATGGTACACAAAGGCTGTCTTGAAAAGA

^CGTAAGTGTGTTTrTTGGTTTAATAAAATTGATTATAAATGGATAATG 

^AAAACATTTTAAAGAATTTTACTAGCTTACATTAGCAGAmGGATCCA 

CTGATTGTTACATTCTOTTACTGAGCCCCTGAATTACTTCTCTGAGTAAG 

GCATTATACCAAAGCTATTGATAGTTGGGCTTATAGGGTGTATGTTTGAA 

GAACTACTAATGTCAAAACCAATATTTCACGGTCGACAAGAGGACATCAG 

AACTGGTAATCCTTATTACCATGACTGGCTGGACAGAATACTCAATGTAA 

TGGGAT^^CCTGCAAATAAAGACGGGGAAGATGTAAAAAAGATGCCTGAA 

CATTCAACATTAATGAAAGATTTCAGAAGAAATATGTATACTAACTGCAG 

CC-TATCAAGTATATGGAAAAACACAAAGTTAAACCAGATAGTAAAGCAT 

TCCACTTGCTTCAGAAGTTTCTTACTATGGACCCAATAAAGTGAATTACC 

^GAGAACGGGGTCCCTGTTTCTTCGAAGACCCACTTCCTACATCAGACGT 

~^CAAC^GTTGTCAAATCCCCTACCCAAAATGA(5AATTTTTAA^GAAG 

AAGAACCTGATGACAAAGGAGCCAAAAAGAACCACCACCGGCAGCAGGGC 

CATAACCACACGAATGGAACTGGCCACCCAGGAATCAAGACAACGGTCAC 

ACACAGGGACCCCCGTTGAAGAAAGTGAGGCTTGTTCCTCCTACCACTAC 

CTCAGGTGGACTTrTCACGGCCTCAGACTATCCGCGTTCCAATCCACATG 

CTGCCTATATCCCAACCCTGGACCAAGCACATCCCAGCCGAAGAGCAGTG 

^TCACAGATCGGGGTAGTGGCTTCCAGTTTGTACCTATTCTGGAGTTAG 



CATATGTCATGAATTCCTCCAGTGTAACAACATTATCTGAC^TAGTAC 
ACACACAGACACAAGGTTTAACTGGTACTTGAAAACATACAGTAGGTGCT 
t^^^TGAA^^CCAGGACTCAAAGTAAGATTATTTTGGTACACCTT 
OTG^I^^A^^TTGATTCATTTTCTACAWAATCAGT 

gotSgaccaagaatattgcttgg^ 



gStamtttctttttattttcaatctatcsacttgac^ 

TCAGCAAACAAACTAAAATGATTGTCAC^GACAATGCTTTATTTrrCCTC 



^CTGTGCTCTCTGAAACATGTGCTGTGTCCACTCAGGGTTAAATGGATT 

AAG^GCGGTGCAAGATCTGCTTTGTTAAACAGATGCTTGAAOT 

CTCGTAAG^GTCATCACCACTCCCTAATCTCAAGTACCCAGGGACACAAA 

rarTG^GAAGGCCGCAGGGACCTCTGCCTAGGAAAGCCAGGTATTGTCC 

t^c^^Stgtga^gtctgaaatatggcctcgtgggaggggaa 

g^ctagtaScgaggaaggaacgcctctttgcagttgagaqagaggaa 
SSctgtotctg^g^ctcsggcaatggaatgtctcggtata^ 
^^^a^t^tctactcjagataggggaaaaccaccttagotct 

r,GA^TGGGACATCCGGCAGCAATACTGCTCTTTAAGACATTGAGATC^ 

?S?ccS^Stgcca«tccccctctccgagamcacccaagaatg 
SSSaaaScta^^ 

ggtgcagcagatcaccatggcccacatatacctatttaacaaacctgcac 
GTTCTGCAC^GTATCCCATTTCTTTTTTTTTTTAAGAAATAGAAAAAAA 
AATAAAATTrTGTTCACTGATTCTTCCATTTTAAAACTTGTTTGCATGTG 
GTTTAGGATGCCCTTACTTCAGCAAAGGAGAAGGAATAGGAGGGCCTTAG 
AATTTTTGAGGGAAAAAAACCCTATAACATACATTGTACTGTATCAAACT 
ATTTTACATGAATGACACAAGTATTCTGAATAAAAAAATAATTGAACATT 
GTTAAGAACAAGGTGTCATGTAATTTATTTTTCATAAATAAAAAAATTAT 
AGTGGCTTAGACTGAAAGGAACAGAGAATTTAAAAAATTAAAAAGAAGCC 
TTAGTATATTTTTGTATATAGTTTCCATGTGCCATATTTGCCATAATTGG 
ATGAGAATTTTTTGACCTCTGGCAGGGTGACCCTATATTTTCANTNTATA 
AAGCGTGCATCATACC 